METHODS OF DIAGNOSIS OF PROSTATE CANCER,
COMPOSITIONS AND METHODS OF SCREENING FOR
MODULATORS OF PROSTATE CANCER

CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims priority from the following applications: USSN
09/687,576, filed October 13, 2000; USSN 60/276,791, filed March 16, 2001; USSN
60/288,589, filed May 4, 2001; USSN 09/733,742, filed December 8, 2000; USSN
09/733,288, filed December 8, 2000; USSN 09/847,046, filed April 30, 2001; USSN
60/276,888, filed March 16, 2001; USSN 60/286,214, filed April 24, 2001; USSN
60/281,922, filed April 6, 2001; and USSN 60/263,957, filed January 24, 2001, each of which
is incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

The invention relates to the identification of nucleic acid and protein 
expression profiles and nucleic acids, products, and antibodies thereto that are involved in 
prostate cancer; and to the use of such expression profiles and compositions in the diagnosis, 
prognosis and therapy of prostate cancer. The invention further relates to methods for 
identifying and using agents and/or targets that inhibit prostate cancer.



BACKGROUND OF THE INVENTION 
Prostate cancer is the most commonly diagnosed internal malignancy and the
second most common cause of cancer death in men in the U.S., resulting in approximately
40,000 deaths each year (Landis et al., CA Cancer J. Clin. 48:6-29 (1998); Greenlee et al.,
CA Cancer J. Clin. 50(1):7-13 (2000)), and the incidence of prostate cancer has been
increasing rapidly over the past 20 years in many parts of the world (Nakata et al., Int. J.
Urol. 7(7):254-257 (2000); Majeed et al., BJU Int. 85(9):1058-1062 (2000)). It develops as
the result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer
cell undergoes initiation, proliferation and loss of contact inhibition, culminating in invasion
of surrounding tissue and, ultimately, metastasis.

Deaths from prostate cancer are a result of metastasis of a prostate tumor.
Therefore, early detection of the development of prostate cancer is critical in reducing
mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become
a very common method for early detection and screening, and may have contributed to the
slight decrease in the mortality rate from prostate cancer in recent years (Nowroozi et al.,
Cancer Control 5(6):522-531 (1998)). However, many cases are not diagnosed until the
disease has progressed to an advanced stage.

Treatments such as surgery (prostatectomy), radiation therapy, and
cryotherapy are potentially curative when the cancer remains localized to the prostate.
Therefore, early detection of prostate cancer is important for a positive prognosis for
treatment. Systemic treatment for metastatic prostate cancer is limited to hormone therapy
and chemotherapy. Chemical or surgical castration has been the primary treatment for
symptomatic metastatic prostate cancer for over 50 years. This testicular androgen
deprivation therapy usually results in stabilization or regression of the disease (in 80% of
patients), but progression of metastatic prostate cancer eventually develops (Panvichian et al.,
Cancer Control 3(6):493-500 (1996)). Metastatic disease is currently considered incurable,
and the primary goals of treatment are to prolong survival and improve quality of life (Rago,
Cancer Control 5(6):513-521 (1998)).

Thus, methods that can be used for diagnosis and prognosis of prostate cancer
and for effective treatment of prostate cancer, including in particular metastatic prostate
cancer, would be desirable. Accordingly, provided herein are methods that can be used in
diagnosis and prognosis of prostate cancer. Further provided are methods that can be used to
screen candidate bioactive agents for the ability to modulate, e.g., treat, prostate cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in prostate cancer and other cancers.

SUMMARY OF THE INVENTION
The present invention therefore provides nucleotide sequences of genes that 
are up- and down-regulated in prostate cancer cells. Such genes are useful for diagnostic 
purposes, and also as targets for screening for therapeutic compounds that modulate prostate 
cancer, such as hormones or antibodies. Other aspects of the invention will become apparent
to the skilled artisan from the following description of the invention.

In one aspect, the present invention provides a method of detecting a prostate 
cancer-associated transcript in a cell from a patient, the method comprising contacting a 
biological sample from the patient with a polynucleotide that selectively hybridizes to a 
sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the present invention provides a method of determining 
the level of a prostate cancer associated transcript in a cell from a patient. 

In one embodiment, the present invention provides a method of detecting a 
prostate cancer-associated transcript in a cell from a patient, the method comprising 
contacting a biological sample from the patient with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

In one embodiment, the polynucleotide selectively hybridizes to a sequence at
least 95% identical to a sequence as shown in Tables 1-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1-16.

In one embodiment, the biological sample is a tissue sample. In another
embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent
label.

In one embodiment, the polynucleotide is immobilized on a solid surface.

In one embodiment, the patient is undergoing a therapeutic regimen to treat
prostate cancer. In another embodiment, the patient is suspected of having metastatic
prostate cancer.

In one embodiment, the patient is a human.

In one embodiment, the patient is suspected of having a taxol-resistant cancer.

In one embodiment, the prostate cancer-associated transcript is mRNA.

In one embodiment, the method further comprises the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the
efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate
cancer. In a further embodiment, the patient has a drug-resistant (e.g., taxol-resistant) form of
prostate cancer.

In one embodiment, the method further comprises the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

Additionally, provided herein is a method of evaluating the effect of a
candidate prostate cancer drug comprising administering the drug to a patient and removing a
cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Tables 1-16.

In one aspect, the present invention provides an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1-16.

In one embodiment, an expression vector or cell comprises the isolated nucleic
acid.

In one aspect, the present invention provides an isolated polypeptide which is
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-16.

In another aspect, the present invention provides an antibody that specifically
binds to an isolated polypeptide which is encoded by a nucleic acid molecule having a
polynucleotide sequence as shown in Tables 1-16.

In one embodiment, the antibody is conjugated to an effector component, e.g.,
a fluorescent label, a radioisotope or a cytotoxic chemical.

In one embodiment, the antibody is an antibody fragment. In another
embodiment, the antibody is humanized.

In one aspect, the present invention provides a method of detecting a prostate
cancer cell in a biological sample from a patient, the method comprising contacting the
biological sample with an antibody as described herein.

In another aspect, the present invention provides a method of detecting 
antibodies specific to prostate cancer in a patient, the method comprising contacting a 
biological sample from the patient with a polypeptide encoded by a nucleic acid comprising a
sequence from Tables 1-16. 

In another aspect, the present invention provides a method for identifying a
compound that modulates a prostate cancer-associated polypeptide, the method comprising
the steps of: (i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and (ii) determining the functional
effect of the compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic
effect, or a chemical effect.

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand
binding to the polypeptide.

In another aspect, the present invention provides a method of inhibiting
proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the
method comprising the step of administering to the patient a therapeutically effective amount
of a compound identified as described herein.

In one embodiment, the compound is an antibody.

In another aspect, the present invention provides a drug screening assay
comprising the steps of: (i) administering a test compound to a mammal having prostate
cancer or to a cell sample isolated therefrom; (ii) comparing the level of gene expression of a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates
the level of expression of the polynucleotide is a candidate for the treatment of prostate
cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell 
sample therefrom that has not been treated with the test compound. In another embodiment, 
the control is a normal cell or mammal. 

In one embodiment, the test compound is administered in varying amounts or 
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the 
drug candidate. 

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment, the plurality of polynucleotides is from three to ten.
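
By way of illustration only, the comparison of expression levels between a treated sample and
a control sample can be sketched as follows. The gene labels, the expression values, and the
two-fold cutoff are hypothetical; they are not taken from Tables 1-16, and the specification
does not fix any particular threshold.

```python
import math

def modulated_genes(treated, control, min_fold=2.0):
    """Return the log2 ratio for each gene whose level changes by at least min_fold.

    treated, control: dicts mapping a gene label to a positive expression level.
    min_fold: hypothetical fold-change cutoff (an assumption, not from the text).
    """
    cutoff = math.log2(min_fold)
    result = {}
    # Only genes measured in both samples can be compared.
    for gene in treated.keys() & control.keys():
        log_ratio = math.log2(treated[gene] / control[gene])
        if abs(log_ratio) >= cutoff:
            result[gene] = log_ratio
    return result

# Hypothetical expression levels for a panel of three transcripts.
control = {"geneA": 100.0, "geneB": 80.0, "geneC": 50.0}
treated = {"geneA": 25.0, "geneB": 85.0, "geneC": 200.0}
hits = modulated_genes(treated, control)
```

A negative log ratio indicates decreased expression in the treated sample, and a positive
value indicates increased expression; either direction counts as modulation in this sketch.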

In another aspect, the present invention provides a method for treating a
mammal having prostate cancer comprising administering a compound identified by the
assay described herein.

In another aspect, the present invention provides a pharmaceutical
composition for treating a mammal having prostate cancer, the composition comprising a
compound identified by the assay described herein and a physiologically acceptable
excipient.

In one aspect, the present invention provides a method of screening drug
candidates by providing a cell expressing a gene that is up- or down-regulated as in
prostate cancer. In one embodiment, the gene is selected from Tables 1-16. The method
further includes adding a drug candidate to the cell and determining the effect of the drug
candidate on the expression of the expression profile gene.

In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease.

Also provided is a method of evaluating the effect of a candidate prostate
cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the prostate cancer modulatory protein, or an animal lacking the prostate
cancer modulatory protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid
segments of Tables 1-16, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate
cancer is provided. The method comprises determining the expression of a gene of Tables
1-16, in a first tissue type of a first individual, and comparing the distribution to the expression
of the gene from a second normal tissue type from the first individual or a second unaffected
individual. A difference in the expression indicates that the first individual has a disorder
associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence 
of a gene that is not up- and down-regulated in prostate cancer. 

In one embodiment, a method is provided for screening for a bioactive agent capable of
interfering with the binding of a prostate cancer modulating protein (prostate cancer
modulatory protein) or a fragment thereof and an antibody which binds to said prostate
cancer modulatory protein or fragment thereof. In a preferred embodiment, the method
comprises combining a prostate cancer modulatory protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said prostate cancer modulatory protein or
fragment thereof. The method further includes determining the binding of said prostate
cancer modulatory protein or fragment thereof and said antibody. Where there is a change
in binding, an agent is identified as an interfering agent. The interfering agent can be an
agonist or an antagonist. Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an
individual. In one embodiment, a method provided herein comprises administering to an
individual a composition comprising a prostate cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Tables 1-16.

Further provided herein are compositions capable of eliciting an immune
response in an individual. In one embodiment, a composition provided herein comprises a
prostate cancer modulating protein, preferably encoded by a nucleic acid of Tables 1-16, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1-16, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer 
protein, or a fragment thereof, comprising contacting an agent specific for said protein with 
said protein in an amount sufficient to effect neutralization. In another embodiment, the 
protein is encoded by a nucleic acid selected from those of Tables 1-16. 

In another aspect of the invention, a method of treating an individual for
prostate cancer is provided. In one embodiment, the method comprises administering to said
individual an inhibitor of a prostate cancer modulating protein. In another embodiment, the
method comprises administering to a patient having prostate cancer an antibody to a prostate
cancer modulating protein conjugated to a therapeutic moiety. Such a therapeutic moiety can
be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides 
novel methods for diagnosis and prognosis evaluation for prostate cancer (PC), including 
metastatic prostate cancer, as well as methods for screening for compositions which modulate
prostate cancer. Also provided are methods for treating prostate cancer. 

In addition to the other nucleic acid and peptide sequences, the present
invention also relates to the identification of PAA2 as a gene that is highly overexpressed in
prostate cancer patient tissues. The PAA2 sequence is identical to the zinc transporter ZNT4.
Results presented herein demonstrate that PAA2/ZNT4 is highly expressed in prostate cancer
cells. The prostate gland is unique in that it has the highest capacity of any organ in the body
to accumulate zinc. Zinc uptake is regulated by prolactin and testosterone, which induce the
expression of a member of the ZIP family of zinc transporters (Costello et al., 1999, J. Biol.
Chem. 274:17499-17504). Zinc accumulation in the prostate functions to inhibit citrate
oxidation, which results in a decrease in cellular ATP production (Costello and Franklin,
1998, Prostate 35:285-296). Cancer cells are more sensitive to decreased ATP production
and have evolved to prevent zinc accumulation. Without wishing to be bound by theory, the
up-regulation of ZNT4 in prostate cancer cells may result in protection of the cells from high
zinc levels by its ability to pump accumulated zinc out of the cells.

The present invention also relates to nucleic acid sequences encoding PBH1.
PBH1 is related to human TRPC7 (transient receptor potential-related channels, NP_003298),
a putative calcium channel highly expressed in brain (Nagamine et al., Genomics 54:124-131
(1998)). Trp is related to melastatin, a gene down-regulated in metastatic melanomas
(Duncan et al., Cancer Res. 58:1515-1520 (1998)), and MTR1, a gene localized to within the
Beckwith-Wiedemann syndrome/Wilms' tumor susceptibility region (Prawitt et al., Hum.
Mol. Genet. 9:203-216 (2000)). Without wishing to be bound by theory, it is believed that
PBH1 functions as a calcium channel.

As a calcium channel, PBH1 is an ideal target for a small molecule
therapeutic, or a therapeutic antibody that disrupts channel function. CD20, the target of
Rituximab in non-Hodgkin's lymphoma (Maloney et al., Blood 90:2188-2195 (1997); Leget
and Czuczman, Curr. Opin. Oncol. 10:548-551 (1998)), is a plasma membrane calcium
channel expressed in B cells (Tedder and Engel, Immunol. Today 15:450-454 (1994)).
Similarly, a small molecule or antibody that inhibits or alters a calcium signal mediated by
PBH1 will result in the death of prostate cancer cells.

PBH1, and other genes of the invention, are also useful as targets for
cytotoxic T-lymphocytes. Genes that are tumor specific, or that are expressed in
immune-privileged organs, are currently being used as potential vaccine targets (Van den
Eynde and Boon, Int. J. Clin. Lab. Res. 27:81-86 (1997)). The expression pattern of PBH1
indicates that it is an ideal target for cytotoxic T-lymphocytes. Thus, therapies that utilize
PBH1-specific cytotoxic T-lymphocytes to induce prostate cancer cell death are also provided
by this invention. See, e.g., U.S. Patent No. 6,051,227 and WO 00/32231, the disclosures of
which are herein incorporated by reference.

The present invention is also related to the identification of PAA3 as a gene
that is important in the modulation of prostate cancer and/or breast cancer.

Tables 1-16 provide unigene cluster identification numbers, exemplar
accession numbers, or genomic nucleotide position numbers for the nucleotide sequence of
genes that exhibit increased or decreased expression in prostate cancer samples.



Definitions 

The term "prostate cancer protein" or "prostate cancer polynucleotide" or
"prostate cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues that: (1) have a nucleotide sequence
that has greater than about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide
sequence identity, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more nucleotides, to a nucleotide sequence of or associated with a unigene
cluster of Tables 1-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an
immunogen comprising an amino acid sequence encoded by a nucleotide sequence of or
associated with a unigene cluster of Tables 1-16, and conservatively modified variants
thereof; (3) specifically hybridize under stringent hybridization conditions to a nucleic acid
sequence, or the complement thereof, of Tables 1-16 and conservatively modified variants
thereof; or (4) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over
a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid
sequence encoded by a nucleotide sequence of or associated with a unigene cluster of Tables
1-16. A polynucleotide or polypeptide sequence is typically from a mammal including, but
not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep,
or other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms.

A "full length" prostate cancer protein or nucleic acid refers to a prostate
cancer polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the
elements normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. For example, a full length prostate cancer nucleic
acid will typically comprise all of the exons that encode the full length, naturally occurring
protein. The "full length" may be prior to, or after, various stages of post-translation
processing or splicing, including alternative splicing.
"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
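
For illustration, percent identity over a designated region can be computed from two sequences
that have already been aligned. The sketch below is not the BLAST procedure itself; the function
name and the treatment of gap columns are assumptions made only for this example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a column with a gap in only one sequence counts as a mismatch.
    This gap convention is an assumption; alignment programs differ on it.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns

# Ten aligned columns: eight matches, one substitution, one gap.
identity = percent_identity("ACGTACGTAC", "ACGTTCG-AC")
```

Under these assumptions the example pair is 80% identical over the ten-residue region, which
is below the 25-residue minimum region length preferred in the text and is used here only to
keep the example short.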

For sequence comparison, typically one sequence acts as a reference sequence, 
to which test sequences are compared. When using a sequence comparison algorithm, test 
and reference sequences are entered into a computer, subsequence coordinates are designated, 
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences 
relative to the reference sequence, based on the program parameters. 

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting typically of
from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which
a sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l.
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
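
As a minimal sketch of the local homology approach of Smith & Waterman cited above, the
following computes only the best local alignment score of two sequences. The match, mismatch
and gap values are illustrative assumptions and a simple linear gap penalty is used; the
packaged implementations named above offer other scoring schemes.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty.

    The scoring values are illustrative assumptions, not parameters from the text.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

# The shared segment "ACGT" is found despite unrelated flanking residues.
score = smith_waterman_score("TTACGTTT", "GGACGTGG")
```

Only two rows of the scoring matrix are kept because the sketch reports a score and does not
trace back the aligned residues.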

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity include the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et
al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters
described herein, to determine percent sequence identity for the nucleic acids and proteins of




WO 02/30268 



PCT/US01/32045 



the invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
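
The extension step described above can be sketched for nucleotide sequences as follows. This
is a simplified, ungapped, one-directional illustration and not the BLAST implementation: the
function name, the default value chosen for X, and the example sequences are hypothetical,
while the reward and penalty defaults follow the M=5 and N=-4 values quoted above.

```python
def extend_hit_right(query, subject, q_start, s_start, word_len,
                     reward=5, penalty=-4, x_drop=20):
    """Extend a word hit to the right without gaps.

    Returns (best_score, best_len), where best_len is the number of residues
    past the seed word included in the best-scoring extension.
    x_drop is a hypothetical value for the quantity X.
    """
    def pair_score(i, j):
        return reward if query[i] == subject[j] else penalty

    # Score of the seed word itself.
    score = sum(pair_score(q_start + k, s_start + k) for k in range(word_len))
    best, best_len = score, 0
    i, j, extended = q_start + word_len, s_start + word_len, 0
    while i < len(query) and j < len(subject):
        score += pair_score(i, j)
        extended += 1
        if score > best:
            best, best_len = score, extended
        elif best - score >= x_drop or score <= 0:
            # Halt: the score fell off by X from its maximum, or reached zero or below.
            break
        i += 1
        j += 1
    return best, best_len

# A four-residue seed followed by four matches and then mismatches.
result = extend_hit_right("ACGTACGTGGGG", "ACGTACGTCCCC", 0, 0,
                          word_len=4, x_drop=8)
```

In this example the extension keeps the four matching residues after the seed and stops once
two mismatches have lowered the running score by the chosen X; extension to the left would
proceed in the same manner.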

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001. Log values may
be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically
cross-reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules
or their complements hybridize to each other under stringent conditions, as described below.
Yet another indication that two nucleic acid sequences are substantially identical is that the
same primers can be used to amplify the sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residues are an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those
containing modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three-letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, often silent variations of
a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to
the expression product, but not with respect to actual probe sequences.
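
The notion of silent variations can be made concrete with a short Python sketch that
enumerates every coding sequence for a small peptide. It is an illustration only: the codon
table is deliberately partial, covering just the three residues named in the text, and the
names CODONS and silent_variants are hypothetical.

```python
from itertools import product

# Partial codon table in the RNA alphabet, limited to residues named in the text.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],  # alanine: four synonymous codons
    "M": ["AUG"],                        # methionine: ordinarily a single codon
    "W": ["UGG"],                        # tryptophan: single codon (TGG in DNA notation)
}


def silent_variants(peptide):
    """Return every RNA coding sequence that encodes the given peptide."""
    choices = [CODONS[residue] for residue in peptide]
    return ["".join(codons) for codons in product(*choices)]


for sequence in silent_variants("MAW"):
    print(sequence)
```

For the peptide M-A-W only the alanine position can vary, so the sketch yields four
sequences; two alanines in a row would yield sixteen.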

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alter, add or delete a single amino acid or a small percentage of amino
acids in the encoded sequence are "conservatively modified variants" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention. The following
groups each contain amino acids that are typically conservative substitutions for one another:
1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N),
Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M),
Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S),
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
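
As an illustration, the eight groups listed above can be encoded directly and used to test
whether two peptides differ only by conservative substitutions. The following Python sketch
assumes equal-length, ungapped peptides in one-letter code; the function names are
hypothetical and the check is a simplification, not a definition of substantial identity.

```python
# The eight conservative substitution groups from the text, in one-letter code.
GROUPS = [set("AG"), set("DE"), set("NQ"), set("RK"),
          set("ILMV"), set("FYW"), set("ST"), set("CM")]


def is_conservative(a, b):
    """True if residues a and b are identical or share one of the eight groups."""
    return a == b or any(a in group and b in group for group in GROUPS)


def differ_only_conservatively(seq1, seq2):
    """True if two equal-length peptides differ only by conservative substitutions."""
    if len(seq1) != len(seq2):
        return False
    return all(is_conservative(a, b) for a, b in zip(seq1, seq2))


print(differ_only_conservatively("MKT", "LRS"))
print(differ_only_conservatively("MKT", "MKP"))
```

Because methionine appears in both group 5 and group 8, it pairs with cysteine and with the
aliphatic residues, while cysteine and leucine do not pair with each other.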

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor & Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
often form a compact unit of the polypeptide and are typically 25 to approximately 500
amino acids long. Typical domains are made up of sections of lesser organization such as
stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three
dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three
dimensional structure formed, usually by the noncovalent association of independent tertiary
units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together.
Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more
nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and
polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500,
1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will
generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are
included that may have alternate backbones, comprising, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein,
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones, non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS
Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi &
Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within
one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done
for a variety of reasons, e.g., to increase the stability and half-life of such molecules in
physiological environments or as probes on a biochip. Mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for
example, phosphoramidate (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references
therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579
(1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805
(1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica
Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991);
and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321
(1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues:
A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and
linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all
of which are incorporated by reference). Other analog nucleic acids include those with
positive backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic
backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863;
Kiedrowshi et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994);
Chapters 2 and 3, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense
Research", Ed. Y.S. Sanghvi and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal
Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett.
37:743 (1996)) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate
Modifications in Antisense Research", Ed. Y.S. Sanghvi and P. Dan Cook. Nucleic acids
containing one or more carbocyclic sugars are also included within one definition of nucleic
acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs
are described in Rawls, C & E News June 2, 1997 page 35. All of these references are hereby
expressly incorporated by reference.

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded and single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a
naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
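
The statement that a depicted single strand also defines its complement can be illustrated
with a minimal Python sketch. It assumes a DNA sequence written 5' to 3' using only the
letters A, C, G and T; the names DNA_COMPLEMENT and reverse_complement are hypothetical,
and modified or ambiguous bases are outside the scope of the sketch.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    # complement every base, then reverse so the result also reads 5' to 3'
    return seq.translate(DNA_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which is one quick way to check
that a depicted strand and its complement carry the same information.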

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include fluorescent dyes, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other
entities which can be made detectable, e.g., by incorporating a radiolabel into the peptide or
used to detect antibodies specifically reactive with the peptide. The radioisotope may be, for
example, 3H, 14C, 32P, 35S, or 125I. In some cases, particularly using antibodies against the
proteins of the invention, the radioisotopes are used as toxic moieties, as described below.
The labels may be incorporated into the prostate cancer nucleic acids, proteins and antibodies
at any position. Any method known in the art for conjugating the antibody to the label may
be employed, including those methods described by Hunter et al., Nature 144:945 (1962);
David et al., Biochemistry 13:1014 (1974); Pain et al., J. Immunol. Meth. 40:219 (1981);
and Nygren, J. Histochem. and Cytochem. 30:407 (1982). The lifetime of radiolabeled
peptides or radiolabeled antibody compositions may be extended by the addition of substances
that stabilize the radiolabeled peptide or antibody and protect it from degradation. Any
substance or combination of substances that stabilize the radiolabeled peptide or antibody may
be used, including those substances disclosed in US Patent No. 5,961,955.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe
may be joined by a linkage other than a phosphodiester bond, so long as it does not
functionally interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in
which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
It will be understood by one of skill in the art that probes may bind target sequences lacking
complete complementarity with the probe sequence depending upon the stringency of the
hybridization conditions. The probes are preferably directly labeled as with isotopes,
chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a
streptavidin complex may later bind. By assaying for the presence or absence of the probe,
one can detect the presence or absence of the select sequence or subsequence. Diagnosis or
prognosis may be based at the genomic level, or at the level of RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, e.g.,
recombinant cells express genes that are not found within the native (non-recombinant) form
of the cell or express native genes that are otherwise abnormally expressed, under expressed
or not expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence that is
determinative of the presence of the nucleotide sequence, in a heterogeneous population of
nucleic acids and other biologics (e.g., total cellular or library DNA or RNA). Similarly, the
phrase "specifically (or selectively) binds" to an antibody or "specifically (or selectively)
immunoreactive with," when referring to a protein or peptide, refers to a binding reaction that
is determinative of the presence of the protein, in a heterogeneous population of proteins and
other biologics. Thus, under designated immunoassay or nucleic acid hybridization
conditions, the specified antibodies or nucleic acid probes bind to a particular protein or
nucleotide sequence at least two times the background and more typically more than 10 to
100 times background.

Specific binding to an antibody under such conditions requires an antibody
that is selected for its specificity for a particular protein. For example, polyclonal antibodies
raised to a particular protein, polymorphic variants, alleles, orthologs, and conservatively
modified variants, or splice variants, or portions thereof, can be selected to obtain only those
polyclonal antibodies that are specifically immunoreactive with the desired prostate cancer
protein and not with other proteins. This selection may be achieved by subtracting out
antibodies that cross-react with other molecules. A variety of immunoassay formats may be
used to select antibodies specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select antibodies specifically
immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual
(1988) for a description of immunoassay formats and conditions that can be used to
determine specific immunoreactivity).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at
which 50% of the probes complementary to the target hybridize to the target sequence at
equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt concentration is
less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or
other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g.,
10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec. - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., N.Y. (1990).
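
The general ranges given above for stringent hybridization can be summarized in a small
Python sketch. It is a simplification made for illustration: the "about" limits in the text are
treated as hard cut-offs, probes of 50 nucleotides or fewer are treated as short, and the
function names are hypothetical.

```python
def stringent_temperature_window(tm_c):
    """Return the (low, high) temperature range lying 5-10 C below a given Tm."""
    return tm_c - 10.0, tm_c - 5.0


def meets_stringent_guidelines(sodium_molar, ph, temp_c, probe_length):
    """Check hybridization conditions against the general ranges in the text."""
    salt_ok = sodium_molar < 1.0          # less than about 1.0 M sodium ion
    ph_ok = 7.0 <= ph <= 8.3              # pH 7.0 to 8.3
    # short probes need at least about 30 C, long probes at least about 60 C
    min_temp = 30.0 if probe_length <= 50 else 60.0
    return salt_ok and ph_ok and temp_c >= min_temp


print(stringent_temperature_window(65.0))
print(meets_stringent_guidelines(0.5, 7.5, 42.0, 25))
print(meets_stringent_guidelines(0.5, 7.5, 42.0, 200))
```

The sketch does not model formamide or other destabilizing agents, which the text notes
can also be used to reach stringent conditions at lower temperatures.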

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references,
e.g., Current Protocols in Molecular Biology, ed. Ausubel et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of a prostate cancer protein includes the determination of a parameter
that is indirectly or directly under the influence of the prostate cancer protein or nucleic acid,
e.g., a functional, physical, or chemical effect, such as the ability to decrease prostate cancer.
It includes ligand binding activity; cell growth on soft agar; anchorage dependence; contact
inhibition and density limitation of growth; cellular proliferation; cellular transformation;
growth factor or serum dependence; tumor specific marker levels; invasiveness into Matrigel;
tumor growth and metastasis in vivo; mRNA and protein expression in cells undergoing
metastasis, and other characteristics of prostate cancer cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that 
increases or decreases a parameter that is indirectly or directly under the influence of a 

20 prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects. 
Such functional effects can be measured by any means known to those skilled in the art, e.g., 
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index), 
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, 
measuring inducible markers or transcriptional activation of the prostate cancer protein; 

25 measuring binding activity or binding assays, e.g. binding to antibodies or other ligands, and 
measuring cellular proliferation. Determination of the functional effect of a compound on 
prostate cancer can also be performed using prostate cancer assays known to those of skill in 
the art such as an in vitro assays, e.g., cell growth on soft agar; anchorage dependence; 
contact inhibition and density limitation of growth; cellular proliferation; cellular 

30 transformation; growth factor or serum dependence; tumor specific marker levels; 
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein 
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expression in cells undergoing metastasis, and other characteristics of prostate cancer cells. 
The functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for prostate cancer-associated sequences, 
5 measurement of RNA stability, identification of downstream or reporter gene expression 
(CAT, luciferase, 0-gal, GFP and the like), e.g., via chemiluminescence, fluorescence, 
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", •'activators'*, and "modulators" of prostate cancer polynucleotide 
and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules 

10 or compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide 
and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally 
block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate 
the activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic 
acids may seem to inhibit expression and subsequent function of the protein. "Activators" 

15 are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, 
or up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also 
include genetically modified versions of prostate cancer proteins, e.g., versions with altered 
activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g.,
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative
modulator compounds, and then determining the functional effects on activity, as described 
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences
set out in Tables 1-16. 

Samples or assays comprising prostate cancer proteins that are treated with a 
potential activator, inhibitor, or modulator are compared to control samples without the 
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition
of a polypeptide is achieved when the activity value relative to the control is about 80%,
preferably 50%, more preferably 25-0%. Activation of a prostate cancer polypeptide is
achieved when the activity value relative to the control (untreated with activators) is 110%,
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the
control), more preferably 1000-3000% higher.
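
By way of illustration only, the comparison against an untreated control described above can be sketched as follows. The function names are hypothetical, and treating "about 80%" and "110%" as hard cut-offs is an assumption made for the example rather than a requirement of the assays.

```python
def relative_activity(treated: float, control: float) -> float:
    """Express treated-sample activity as a percentage of the untreated control.

    The control is assigned a relative activity value of 100%.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def classify_relative_activity(percent: float) -> str:
    """Label a relative activity value.

    Assumption: 80% or lower counts as inhibition and 110% or higher
    counts as activation; values in between receive no call.
    """
    if percent <= 80.0:
        return "inhibition"
    if percent >= 110.0:
        return "activation"
    return "no call"


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    for treated in (25.0, 95.0, 150.0):
        pct = relative_activity(treated, control=100.0)
        print(treated, pct, classify_relative_activity(pct))
```

Stricter embodiments (e.g., 50% or 25% residual activity) could be represented by passing different cut-offs instead of the fixed values used here.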

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage 
independence, semi-solid or soft agar growth, changes in contact inhibition and density 
limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells: A Manual of Basic
Technique, pp. 231-241 (3rd ed. 1994).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.

"Cancer cells," "transformed" cells or "transformation" in tissue culture, refers
to spontaneous or induced phenotypic changes that do not necessarily involve the uptake of
new genetic material. Although transformation can arise from infection with a transforming 
virus and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney,
Culture of Animal Cells: A Manual of Basic Technique (3rd ed. 1994)).

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody or
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.

An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well- 
characterized fragments produced by digestion with various peptidases. Thus, e.g., pepsin 
digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a
dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide linkage in the hinge region,
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.
1993)). While various antibody fragments are defined in terms of the digestion of an intact
antibody, one of skill will appreciate that such fragments may be synthesized de novo either
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used 
herein, also includes antibody fragments either produced by the modification of whole 
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single 
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature
348:552-554 (1990)).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein,
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp.
77-96 in Monoclonal Antibodies and Cancer Therapy (1985); Coligan, Current Protocols in
Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding,
Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the 
production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce 
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such 
as other mammals, may be used to express humanized antibodies. Alternatively, phage
display technology can be used to identify antibodies and heteromeric Fab fragments that
specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554
(1990); Marks et al., Biotechnology 10:779-783 (1992)).

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 

Identification of prostate cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is 
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from cancerous or metastatic cancerous tissue of the prostate, or 
prostate cancer tissue or metastatic prostate cancerous tissue can be compared with tissue
samples of prostate and other tissues from surviving cancer patients. By comparing
expression profiles of tissue in known different prostate cancer states, information regarding
which genes are important (including both up- and down-regulation of genes) in each of these
states is obtained.
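
As a minimal sketch of how a multi-gene "fingerprint" might be compared against profiles of known states, the following example assigns a sample to the reference state whose profile it correlates with most strongly. The use of Pearson correlation, the function names, and the example values are assumptions chosen for illustration; the text above does not prescribe a particular similarity measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


def closest_state(sample, references):
    """Return the name of the reference state most correlated with the sample.

    `references` maps a state name to an expression profile measured over
    the same genes, in the same order, as `sample`.
    """
    return max(references, key=lambda state: pearson(sample, references[state]))


if __name__ == "__main__":
    # Hypothetical expression levels for four genes.
    references = {
        "normal": [1.0, 1.0, 5.0, 5.0],
        "cancer": [5.0, 5.0, 1.0, 1.0],
    }
    print(closest_state([4.5, 5.2, 1.1, 0.8], references))
```

Evaluating many genes at once is what makes the comparison informative: any single gene may be similarly expressed in two states, while the overall pattern differs.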

The identification of sequences that are differentially expressed in prostate
cancer versus non-prostate cancer tissue allows the use of this information in a number of
ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic
drug act to down-regulate prostate cancer, and thus tumor growth or recurrence, in a
particular patient? Similarly, diagnosis and treatment outcomes may be made or confirmed by
comparing patient samples with the known expression profiles. Metastatic tissue can also be
analyzed to determine the stage of prostate cancer in the tissue. Furthermore, these gene
expression profiles (or individual genes) allow screening of drug candidates with an eye to
mimicking or altering a particular expression profile; e.g., screening can be done for drugs
that suppress the prostate cancer expression profile. This may be done by making biochips 
comprising sets of the important prostate cancer genes, which can then be used in these 
screens. These methods can also be done on the protein basis; that is, protein expression 
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered 
for gene therapy purposes, including the administration of antisense nucleic acids, or the 
prostate cancer proteins (including antibodies and other modulators thereof) administered as 
therapeutic drugs. 

Thus the present invention provides nucleic acid and protein sequences that
are differentially expressed in prostate cancer, herein termed "prostate cancer sequences." As
outlined below, prostate cancer sequences include those that are up-regulated (i.e., expressed 
at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., expressed 
at a lower level). In a preferred embodiment, the prostate cancer sequences are from humans;
however, as will be appreciated by those in the art, prostate cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other
prostate cancer sequences are provided, from vertebrates, including mammals, including
rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep,
goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer sequences
from other organisms may be obtained using the techniques outlined below.

Prostate cancer sequences can include both nucleic acid and amino acid 
sequences. As will be appreciated by those in the art and is more fully outlined below, 
prostate cancer nucleic acid sequences are useful in a variety of applications, including 
diagnostic applications, which will detect naturally occurring nucleic acids, as well as
screening applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates
with selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic 
acid and/or amino acid sequence homology to the prostate cancer sequences outlined herein. 
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, using either homology programs or hybridization
conditions.

For identifying prostate cancer-associated sequences, the prostate cancer
screen typically includes comparing genes identified in different tissues, e.g., normal and
cancerous tissues, or tumor tissue samples from patients who have metastatic disease vs.
non-metastatic tissue. Other suitable tissue comparisons include comparing prostate cancer
samples with metastatic cancer samples from other cancers, such as lung, breast,
gastrointestinal cancers, ovarian, etc. Samples of different stages of prostate cancer, e.g.,
survivor tissue, drug resistant states, and tissue undergoing metastasis, are applied to biochips 
comprising nucleic acid probes. The samples are first microdissected, if applicable, and 
treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix. Gene expression profiles as described herein
are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between 
normal and disease states are compared to genes expressed in other normal tissues, preferably 
normal prostate, but also including, and not limited to, lung, heart, brain, liver, breast, kidney,
muscle, colon, small intestine, large intestine, spleen, bone and placenta. In a preferred
embodiment, those genes identified during the prostate cancer screen that are expressed in 
any significant amount in other tissues are removed from the profile, although in some 
embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable 
that the target be disease specific, to minimize possible side effects. 
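
The tissue-specificity filter described above can be sketched as follows. This is an illustrative assumption of how such a filter might be coded: the function name, the data layout, and the numeric cut-off standing in for "any significant amount" are all hypothetical.

```python
def filter_disease_specific(candidates, normal_expression, max_normal_level):
    """Keep candidate genes that are not appreciably expressed in other normal tissues.

    candidates        -- gene identifiers found in the prostate cancer screen
    normal_expression -- maps gene -> {tissue name: expression level}
    max_normal_level  -- hypothetical cut-off for a "significant amount"

    A gene with no recorded normal-tissue data is kept, since nothing
    argues for its removal.
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < max_normal_level for level in levels.values()):
            kept.append(gene)
    return kept


if __name__ == "__main__":
    # Hypothetical genes and expression levels in arbitrary units.
    normal_expression = {
        "geneA": {"lung": 5.0, "heart": 3.0, "liver": 8.0},
        "geneB": {"lung": 250.0, "heart": 4.0, "liver": 2.0},
    }
    print(filter_disease_specific(["geneA", "geneB", "geneC"], normal_expression, 50.0))
```

In embodiments where removal is not necessary, the cut-off can simply be set high enough that no candidate is excluded.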

In a preferred embodiment, prostate cancer sequences are those that are
up-regulated in prostate cancer, that is, the expression of these genes is higher in the prostate
cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein often
means at least about a two-fold change, preferably at least about a three-fold change, with at
least about five-fold or higher being preferred. All UniGene cluster identification numbers
and accession numbers herein are for the GenBank sequence database and the sequences of
the accession numbers are hereby expressly incorporated by reference. GenBank is known in
the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Data Bank of Japan (DDBJ).

In another preferred embodiment, prostate cancer sequences are those that are
down-regulated in prostate cancer; that is, the expression of these genes is lower in prostate
cancer tissue as compared to non-cancerous tissue (see, e.g., Tables 8, 12 and 14).
"Down-regulation" as used herein often means at least about a 1.5-fold change, more preferably a
two-fold change, preferably at least about a three-fold change, with at least about five-fold or
higher being most preferred.
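
The fold-change definitions of up- and down-regulation given above can be illustrated with a short sketch. The function names and default parameters are hypothetical; the defaults use the least stringent thresholds stated in the text (two-fold for up-regulation, 1.5-fold for down-regulation), and "at least about" is treated as a simple greater-than-or-equal test.

```python
def fold_change(cancer_level: float, normal_level: float) -> float:
    """Ratio of expression in cancer tissue to expression in non-cancerous tissue."""
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    return cancer_level / normal_level


def classify_regulation(cancer_level, normal_level, up_min=2.0, down_min=1.5):
    """Label a gene as up-regulated, down-regulated, or unchanged.

    up_min   -- minimum cancer/normal ratio for up-regulation
    down_min -- minimum normal/cancer ratio for down-regulation
    """
    if fold_change(cancer_level, normal_level) >= up_min:
        return "up-regulated"
    if fold_change(normal_level, cancer_level) >= down_min:
        return "down-regulated"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units.
    print(classify_regulation(200.0, 100.0))
    print(classify_regulation(100.0, 150.0))
    print(classify_regulation(120.0, 100.0))
```

More preferred embodiments (three-fold, five-fold) correspond to calling `classify_regulation` with larger `up_min` and `down_min` values.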

Informatics 

The ability to identify genes that are over or under expressed in prostate 
cancer can additionally provide high-resolution, high-sensitivity datasets which can be used
in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, protein
structure, biosensor development, and other related areas. For example, the expression 
profiles can be used in diagnostic or prognostic evaluation of patients with prostate cancer. 
Or as another example, subcellular toxicological information can be generated to better direct 
drug structure and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological 
sensor device to predict the likely toxicological effect of chemical exposures and likely 
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue 
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired, e.g., 
using array analysis either singly or in a library format. The database can be in substantially 
any form in which data can be maintained and transmitted, but is preferably an electronic
database. The electronic database of the invention can be maintained on any electronic
device allowing for the storage of and access to the database, such as a personal computer, 
but is preferably distributed on a wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence 
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar
databases can be assembled for any assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative
and/or absolute abundance of a variety of molecular and macromolecular species from a
biological sample undergoing prostate cancer, i.e., the identification of prostate
cancer-associated sequences described herein, provide an abundance of information, which can be
correlated with pathological conditions, predisposition to disease, drug testing, therapeutic
monitoring, gene-disease causal linkages, identification of correlates of immunity and
physiological status, among others. Although the data generated from the assays of the
invention is suited for manual review and analysis, in a preferred embodiment, prior data
processing using high-speed computers is utilized.

An array of methods for indexing and retrieving biomolecular information is
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational
database system for storing biomolecular sequence information in a manner that allows
sequences to be catalogued and searched according to one or more protein function
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records
containing information in a format that allows a collection of partial-length DNA sequences
to be catalogued and searched according to association with one or more sequencing projects
for obtaining full-length sequences from the collection of partial length sequences. U.S.
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.
See also Mount et al., Bioinformatics (2001); Biological Sequence Analysis:
Probabilistic Models of Proteins and Nucleic Acids (Durbin et al., eds., 1999);
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis &
Ouellette, eds., 1998); Rashidi & Buehler, Bioinformatics: Basic Applications in Biological
Science and Medicine (1999); Introduction to Computational Molecular Biology (Setubal et
al., eds., 1997); Bioinformatics: Methods and Protocols (Misener & Krawetz, eds., 2000);
Bioinformatics: Sequence, Structure, and Databanks: A Practical Approach (Higgins &
Taylor, eds., 2000); Brown, Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet (2001); Han & Kamber, Data Mining: Concepts and Techniques (2000); and
Waterman, Introduction to Computational Biology: Maps, Sequences, and Genomes (1995).

The present invention provides a computer database comprising a computer 
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing
sample is from a control tissue sample known to be free of pathological disorders. In a
variation, at least one of the sources is a known pathological tissue specimen, e.g., a
neoplastic lesion or another tissue specimen to be analyzed for prostate cancer. In another
variation, the assay records cross-tabulate one or more of the following parameters for each
target species in a sample: (1) a unique identification code, which can include, e.g., a target
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic 
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species 
present in the sample. 
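
One way the three cross-tabulated parameters above might be held in an electronic database is sketched below using an in-memory SQLite database. The table name, column names, and example records are hypothetical and are offered only to make the record layout concrete; no particular database product or schema is specified in the text.

```python
import sqlite3


def build_assay_db(records):
    """Create an in-memory database of assay records.

    Each record is a (target_id, sample_source, quantity) tuple, mirroring
    parameters (1) to (3): identification code, sample source, and quantity.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assay_record ("
        " target_id TEXT NOT NULL,"
        " sample_source TEXT NOT NULL,"
        " quantity REAL NOT NULL,"
        " PRIMARY KEY (target_id, sample_source))"
    )
    conn.executemany("INSERT INTO assay_record VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def cross_tabulate(conn):
    """Return {target_id: {sample_source: quantity}} for every stored record."""
    table = {}
    rows = conn.execute(
        "SELECT target_id, sample_source, quantity FROM assay_record"
    )
    for target_id, source, quantity in rows:
        table.setdefault(target_id, {})[source] = quantity
    return table


if __name__ == "__main__":
    # Hypothetical target identifiers, sources, and relative quantities.
    conn = build_assay_db([
        ("T1", "control", 1.0),
        ("T1", "tumor", 4.0),
        ("T2", "control", 2.0),
    ])
    print(cross_tabulate(conn))
```

The composite primary key reflects the assumption that each target species is quantified once per sample source.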

The invention also provides for the storage and retrieval of a collection of
target data in a computer data storage apparatus, which can include magnetic disks, optical
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM,
magnetic bubble memory devices, and other data storage devices, including CPU registers
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern
in an array of magnetic domains on a magnetizable medium or as an array of charge states or
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of
a transistor and a charge storage area, which may be on the transistor). In one embodiment,
the invention provides such storage devices, and computer systems built therewith,
comprising a bit pattern encoding a protein expression fingerprint record comprising unique
identifiers for at least 10 target data records cross-tabulated with target source.

WO 02/30268 PCT/US01/32045

When the target is a peptide or nucleic acid, the invention preferably provides 
a method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen. 
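
A computerized comparison between a stored sequence and another sequence can be sketched with a textbook global alignment score in the style of Needleman and Wunsch. This is an illustrative stand-in and not the implementation of any of the programs named above; the function name and the match, mismatch, and gap values are assumptions chosen for the example.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Return the best global alignment score of sequences a and b.

    Uses dynamic programming with a linear gap penalty, keeping only
    the previous row of the score matrix in memory.
    """
    # Row 0: aligning an empty prefix of `a` against each prefix of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against an empty prefix of `b`
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = curr[j - 1] + gap  # gap inserted in `a`
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical nucleotide sequences.
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

Production tools typically use substitution matrices and affine gap penalties; the linear scheme here is kept deliberately simple.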

The invention also preferably provides a magnetic disk, such as an
IBM-compatible (DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of
the invention in a file format suitable for retrieval and processing in a computerized sequence 
analysis, comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line,
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium,
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for 
comparing a query target to a database containing an array of data structures, such as an assay
result obtained by the method of the invention, and ranking database targets based on the
degree of identity and gap weight to the target data. A central processor is preferably
initialized to load and execute the computer program for alignment and/or comparison of the
assay results. Data for a query target is entered into the central processor via an I/O device.
Execution of the computer program results in the central processor retrieving the assay data
from the data file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM, 
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the
same characteristic of the query target and results are output via an I/O device. For example,
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha,
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or
public domain molecular biology software package (e.g., UWGCG Sequence Analysis
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory,
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem, 
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or 
other suitable I/O device. 
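
The ranking step described above, in which database targets are ordered by how closely a selected assay characteristic matches that of the query target, might be sketched as follows. The function name, the use of a single numeric characteristic, and the absolute-difference measure of correspondence are assumptions made for illustration.

```python
def rank_targets(query_value, database):
    """Rank database targets by closeness of one assay characteristic to the query.

    query_value -- the query target's value for the selected characteristic
    database    -- maps target identifier -> value of the same characteristic

    Returns (target_id, difference) pairs, closest first; ties are broken
    by target identifier so the ordering is deterministic.
    """
    scored = [(target_id, abs(value - query_value))
              for target_id, value in database.items()]
    return sorted(scored, key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical binding values in arbitrary units.
    database = {"T1": 90, "T2": 45, "T3": 20}
    for target_id, difference in rank_targets(50, database):
        print(target_id, difference)
```

A sequence-based embodiment would replace the absolute difference with a computed alignment similarity and sort in descending order.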

The invention also preferably provides the use of a computer system, such as
that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.

Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular
function and replication (including, e.g., signaling pathways); aberrant expression of such
proteins often results in unregulated or disregulated cellular processes (see, e.g., Molecular
Biology of the Cell (Alberts, ed., 3rd ed., 1994)). For example, many intracellular proteins
have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease
activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins
also serve as docking proteins that are involved in organizing complexes of proteins, or
targeting proteins to various subcellular localizations, and are involved in maintaining the 
structural integrity of organelles. 

An increasingly appreciated concept in characterizing proteins is the presence 
in the proteins of one or more motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of primary
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman et al., Nuc.
Acids Res. 28:263-266 (2000); Sonnhammer et al., Proteins 28:405-420 (1997); Bateman et
al., Nuc. Acids Res. 27:260-262 (1999); and Sonnhammer et al., Nuc. Acids Res. 26:320-322
(1998)).
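
Because such motifs can be identified on the basis of primary sequence, a simple pattern scan illustrates the idea. The sketch below searches for the minimal proline-rich pattern PxxP, which is commonly cited as a core recognition motif for SH3 domains; the choice of this pattern and the function name are assumptions for the example, and profile-based resources such as Pfam use far richer models than a single pattern.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
PXXP = re.compile(r"(?=(P..P))")


def find_pxxp(sequence: str):
    """Return the 0-based start position of every PxxP occurrence.

    `sequence` is a protein sequence in one-letter amino acid code;
    matching is case-insensitive and overlapping matches are included.
    """
    return [m.start() for m in PXXP.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical peptide sequences.
    print(find_pxxp("AAPPLPAA"))
    print(find_pxxp("PAPPAP"))
```

A hit from a pattern of this kind only suggests a candidate interaction site and would ordinarily be followed by comparison against curated domain databases.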

In another embodiment, the prostate cancer sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell.
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins. 

Transmembrane proteins may contain from one to many transmembrane
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that
may be followed by charged amino acids. Therefore, upon analysis of the amino acid
sequence of a particular protein, the localization and number of transmembrane domains
within the protein may be predicted (see, e.g., PSORT web site http://psort.nibb.ac.jp/).
Important transmembrane protein receptors include, but are not limited to, the insulin
receptor, insulin-like growth factor receptor, human growth hormone receptor, glucose
transporters, transferrin receptor, epidermal growth factor receptor, low density lipoprotein
receptor, leptin receptor, and interleukin receptors, e.g., IL-1 receptor and IL-2 receptor.
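
The prediction of transmembrane domains from a run of roughly 20 hydrophobic residues can be illustrated with a simple sliding-window hydropathy scan. The following is a minimal sketch only: the Kyte-Doolittle scale, the 20-residue window, the 1.6 threshold, and the function name predict_tm_segments are assumptions of this example and are not taken from the present description or from PSORT.

```python
# Kyte-Doolittle hydropathy values; the choice of scale, window size, and
# threshold is an assumption of this sketch, not part of the description.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_tm_segments(seq, window=20, threshold=1.6):
    """Return (start, end) pairs (0-based, end exclusive) of candidate
    transmembrane segments.

    Every window whose mean hydropathy meets the threshold is kept, and
    overlapping or adjacent windows are merged into one segment.
    """
    seq = seq.upper()
    segments = []
    for i in range(len(seq) - window + 1):
        mean = sum(KD[aa] for aa in seq[i:i + window]) / window
        if mean >= threshold:
            if segments and i <= segments[-1][1]:
                # Extend the previous segment instead of starting a new one.
                segments[-1] = (segments[-1][0], i + window)
            else:
                segments.append((i, i + window))
    return segments
```

Because overlapping windows are merged, the reported boundaries tend to extend a few residues past the hydrophobic core; the number of segments returned, rather than their exact ends, is the more reliable output of such a scan.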

The extracellular domains of transmembrane proteins are diverse; however,
conserved motifs are found repeatedly among various extracellular domains. Conserved
structure and/or functions have been ascribed to different extracellular motifs. Many
extracellular domains are involved in binding to other molecules. In one aspect, extracellular
domains are found on receptors. Factors that bind the receptor domain include circulating
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like.
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI)
anchor, or may themselves be transmembrane proteins. Extracellular domains also associate
with the extracellular matrix and contribute to the maintenance of the cell structure.

Prostate cancer proteins that are transmembrane are particularly preferred in
the present invention as they are readily accessible targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can 
be made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Prostate cancer proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, e.g.,
for blood, plasma, serum, or stool tests.

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by
substantial nucleic acid and/or amino acid sequence homology or linkage to the prostate
cancer sequences outlined herein. Such homology can be based upon the overall nucleic acid
or amino acid sequence, and is generally determined as outlined below, using either
homology programs or hybridization conditions. Typically, linked sequences on an mRNA are
found on the same molecule.

The prostate cancer nucleic acid sequences of the invention, e.g., the
sequences in Tables 1-16, can be fragments of larger genes, i.e., they are nucleic acid
segments. "Genes" in this context includes coding regions, non-coding regions, and mixtures
of coding and non-coding regions. Accordingly, as will be appreciated by those in the art,
using the sequences provided herein, extended sequences, in either direction, of the prostate
cancer genes can be obtained, using techniques well known in the art for cloning either longer
sequences or the full length sequences; see Ausubel et al., supra. Much can be done by
informatics, and many sequences can be clustered to include multiple sequences
corresponding to a single gene, e.g., using systems such as UniGene (see
http://www.ncbi.nlm.nih.gov/UniGene/).

Once the prostate cancer nucleic acid is identified, it can be cloned and, if
necessary, its constituent parts recombined to form the entire prostate cancer nucleic acid
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g.,
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid
segment, the recombinant prostate cancer nucleic acid can be further used as a probe to
identify and isolate other prostate cancer nucleic acids, e.g., extended coding regions. It can
also be used as a "precursor" nucleic acid to make modified or variant prostate cancer nucleic
acids and proteins.

The prostate cancer nucleic acids of the present invention are used in several 
ways. In a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are 
made and attached to biochips to be used in screening and diagnostic methods, as outlined 
below, or for administration, e.g., for gene therapy, vaccine, and/or antisense applications.
Alternatively, the prostate cancer nucleic acids that include coding regions of prostate cancer
proteins can be put into expression vectors for the expression of prostate cancer proteins, 
again for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic
acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof)
are made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.
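
The notion of imperfect but sufficient complementarity can be made concrete by counting mismatches between a probe and the reverse complement of its target site. The sketch below is illustrative only: the fixed mismatch cutoff (max_mismatches) is a hypothetical stand-in for "hybridizes under the chosen stringency", which in practice depends on temperature, salt, probe length and base composition rather than on a simple count.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe, target_site):
    """Count positions at which the probe fails to pair with the target site.

    Both sequences are written 5' to 3' and must have the same length; the
    probe is compared against the reverse complement of the target site.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect) if p != q)


def is_substantially_complementary(probe, target_site, max_mismatches=2):
    """Hypothetical rule of thumb: tolerate up to max_mismatches mismatches."""
    return count_mismatches(probe, target_site) <= max_mismatches
```

For example, a probe that differs from the perfect complement at a single base would pass with the default cutoff of two but fail if the cutoff were set to zero, mirroring the difference between lower and higher stringency.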

A nucleic acid probe is generally single stranded but can be partially single 
and partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to 
hundreds of bases. 

In a preferred embodiment, more than one probe per sequence is used, with
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
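
As one way to picture a redundant probe set, the sketch below tiles several probes across a target, either separately or with a shared overlap. It is a minimal illustration under stated assumptions: the function name tile_probes, the default probe length of 40 bases (chosen from within the 30 to 50 base range noted above), and the left-to-right tiling scheme are all hypothetical choices for this example, not a design procedure taken from the description.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tile_probes(target, probe_len=40, n_probes=3, overlap=0):
    """Return a list of (start, probe) pairs covering the target 5' to 3'.

    Each probe is the reverse complement of a window of the target, so that
    it can hybridize to that window. Consecutive windows share `overlap`
    bases; overlap=0 gives separate, abutting probes.
    """
    if not 0 <= overlap < probe_len:
        raise ValueError("overlap must be at least 0 and less than probe_len")
    step = probe_len - overlap
    needed = probe_len + step * (n_probes - 1)
    if needed > len(target):
        raise ValueError("target too short for the requested probe set")
    probes = []
    for k in range(n_probes):
        start = k * step
        window = target[start:start + probe_len]
        probes.append((start, reverse_complement(window)))
    return probes
```

With the defaults, three abutting 40-base probes cover the first 120 bases of the target; setting overlap=10 instead yields three probes that each share 10 bases with their neighbor.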

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant that the association or binding between the nucleic acid probe and
the solid support is sufficient to be stable under the conditions of binding, washing, analysis,
and removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as 
will be appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid
support" or other grammatical equivalents herein is meant a material that can be modified to
contain discrete individual sites appropriate for the attachment or association of the nucleic
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large; they include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in copending application entitled Reusable Low
Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15, 1999,
herein incorporated by reference in its entirety.

Generally the substrate is planar, although as will be appreciated by those in
the art, other configurations of substrates may be used as well. For example, the probes may
be placed on the inside surface of a tube, for flow-through sample analysis to minimize
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including 
closed cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art, such as
homo- or hetero-bifunctional linkers (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized as is known in the art,
and then attached to the surface of the solid support. As will be appreciated by those skilled
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may 
be via an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very 
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is
known in the art. For example, photoactivation techniques utilizing photopolymerization
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression
level of prostate cancer-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, a prostate cancer-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will
be proportional to the amount of template in the original sample. Comparison to appropriate
controls provides a measure of the amount of prostate cancer-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990).
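
The proportionality between amplification product and starting template, and the comparison to controls, can be sketched numerically. The example below is an illustration under stated assumptions and is not a protocol from the description or from Innis et al.: it assumes idealized exponential amplification with a constant per-cycle efficiency, and it uses the widely known comparative threshold-cycle (delta-delta Ct) calculation with a reference transcript as the control; the function and parameter names are hypothetical.

```python
def amplicon_copies(initial_copies, cycles, efficiency=1.0):
    """Copies present after `cycles` rounds of idealized amplification.

    efficiency=1.0 means perfect doubling each cycle. For a fixed cycle
    number the result is directly proportional to initial_copies.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control, efficiency=1.0):
    """Fold change of a target transcript in a sample versus a control.

    Each Ct is the cycle at which signal crosses a fixed threshold. The
    target Ct is first normalized to a reference transcript in the same
    specimen, then the sample is compared to the control specimen.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return (1.0 + efficiency) ** (-delta_delta)
```

Under these assumptions, a target that crosses threshold three cycles earlier in a tumor sample than in a normal control, with identical reference Ct values, corresponds to an eightfold higher template level.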

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see Wu & Wallace, Genomics 4:560 (1989), Landegren et al., Science
241:1077 (1988), and Barringer et al., Gene 89:117 (1990)), transcription amplification
(Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)), dot PCR, and linker adapter PCR,
etc.

Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., those encoding
prostate cancer proteins, are used to make a variety of expression vectors to express prostate
cancer proteins which can then be used in screening assays, as described below. Expression
vectors and recombinant DNA technology are well known to those of skill in the art (see,
e.g., Ausubel, supra, and Gene Expression Systems (Fernandez & Hoeffler, eds, 1999)) and
are used to express proteins. The expression vectors may be either self-replicating
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these
expression vectors include transcriptional and translational regulatory nucleic acid operably
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences"
refers to DNA sequences used for the expression of an operably linked coding sequence in a
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are
known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is typically accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are
used in accordance with conventional practice. Transcriptional and translational regulatory
nucleic acid will generally be appropriate to the host cell used to express the prostate cancer
protein. Numerous types of appropriate expression vectors, and suitable regulatory
sequences are known in the art for a variety of host cells.
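
The requirement that a secretory leader be contiguous with, and in reading phase with, the coding sequence can be checked mechanically when a construct is designed, including when a synthetic linker or adaptor is placed between the two. The sketch below is a simplified illustration: the function name in_reading_phase and the specific criteria (ATG start, lengths divisible by three, no in-frame stop before the final codon) are assumptions of this example rather than a definition taken from the description.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_reading_phase(leader, coding, linker=""):
    """Return True if leader + linker + coding form one open reading frame.

    The leader is expected to begin with ATG. The leader plus any linker
    must span whole codons so the coding sequence keeps its own frame, and
    no stop codon may occur in frame before the final codon of the fusion.
    """
    upstream = (leader + linker).upper()
    coding = coding.upper()
    if not upstream.startswith("ATG"):
        return False
    if len(upstream) % 3 or len(coding) % 3:
        return False
    fusion = upstream + coding
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    return all(codon not in STOP_CODONS for codon in codons[:-1])
```

For instance, inserting a six-base linker between leader and coding sequence preserves the reading phase, whereas a two-base linker shifts the frame and a linker that introduces an in-frame TGA truncates the preprotein.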

In general, transcriptional and translational regulatory sequences may include,
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and
stop sequences, translational start and stop sequences, and enhancer or activator sequences. 
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional 
start and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art, 
and are useful in the present invention. 

In addition, an expression vector may comprise additional elements. For
example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, e.g., in mammalian or insect cells for expression and in a
prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a 
selectable marker gene to allow the selection of transformed host cells. Selection genes are
well known in the art and will vary with the host cell used.

The prostate cancer proteins of the present invention are produced by culturing 
a host cell transformed with an expression vector containing nucleic acid encoding a prostate 
cancer protein, under the appropriate conditions to induce or cause expression of the prostate 
cancer protein. Conditions appropriate for prostate cancer protein expression will vary with
the choice of the expression vector and the host cell, and will be easily ascertained by one
skilled in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect 
and animal cells, including mammalian cells. Of particular interest are Saccharomyces 
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells,
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in
mammalian cells. Mammalian expression systems are also known in the art, and include
retroviral and adenoviral systems. One expression vector system is a retroviral vector system
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription
termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements,
flank the coding sequence. Examples of transcription terminator and polyadenylation signals
include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation, 
polybrene mediated transfection, protoplast fusion, electroporation, viral infection, 
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA 
into nuclei. 

In a preferred embodiment, prostate cancer proteins are expressed in bacterial
systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the prostate cancer protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptomyces lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial
expression vectors are transformed into bacterial host cells using techniques well known in
the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, prostate cancer proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based
expression vectors, are well known in the art.

In a preferred embodiment, prostate cancer protein is produced in yeast cells.
Yeast expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The prostate cancer protein may also be made as a fusion protein, using
techniques well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the
desired epitope is small, the prostate cancer protein may be fused to a carrier protein to form
an immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to
increase expression, or for other reasons. For example, when the prostate cancer protein is a
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic
acid for expression purposes.

In a preferred embodiment, the prostate cancer protein is purified or isolated
after expression. Prostate cancer proteins may be isolated or purified in a variety of ways
known to those skilled in the art depending on what other components are present in the
sample. Standard purification methods include electrophoretic, molecular, immunological
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer
protein may be purified using a standard anti-prostate cancer protein antibody column.
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also
useful. For general guidance in suitable purification techniques, see Scopes, Protein
Purification (1982). The degree of purification necessary will vary depending on the use of
the prostate cancer protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the prostate cancer proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins

In one embodiment, the prostate cancer proteins are derivative or variant 
prostate cancer proteins as compared to the wild-type sequence. That is, as outlined more 
fully below, the derivative prostate cancer peptide will often contain at least one amino acid
substitution, deletion or insertion, with amino acid substitutions being particularly preferred.
The amino acid substitution, insertion or deletion may occur at any residue within the 
prostate cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the
present invention are amino acid sequence variants. These variants typically fall into one or
more of three classes: substitutional, insertional or deletional variants. These variants
ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the
prostate cancer protein, using cassette or PCR mutagenesis or other techniques well known in
the art, to produce DNA encoding the variant, and thereafter expressing the DNA in
recombinant cell culture as outlined above. However, variant prostate cancer protein
fragments having up to about 100-150 residues may be prepared by in vitro synthesis using
established techniques. Amino acid sequence variants are characterized by the predetermined
nature of the variation, a feature that sets them apart from naturally occurring allelic or
interspecies variation of the prostate cancer protein amino acid sequence. The variants
typically exhibit the same qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed prostate cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution
mutations at predetermined sites in DNA having a known sequence are well known, e.g.,
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using
assays of prostate cancer protein activities.
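
Mutagenesis at a predetermined codon, where the replacement itself is not predetermined, can be pictured as enumerating every alternative residue at that position. The sketch below is a hedged in silico illustration only: it uses the standard genetic code, and the function names translate and saturate_codon, as well as the choice to keep one representative codon per amino acid and to skip stop codons, are assumptions of this example rather than a laboratory protocol from the description.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def saturate_codon(cds, codon_index):
    """Return {amino_acid: variant_cds} for one codon position (0-based).

    The wild-type residue and stop codons are skipped, and the first codon
    encountered for each amino acid is used, giving 19 substitution variants.
    """
    cds = cds.upper()
    start = 3 * codon_index
    wild_type = CODON_TABLE[cds[start:start + 3]]
    variants = {}
    for codon, aa in CODON_TABLE.items():
        if aa in (wild_type, "*") or aa in variants:
            continue
        variants[aa] = cds[:start] + codon + cds[start + 3:]
    return variants
```

Each of the resulting variant coding sequences would then be expressed and screened for activity as described above; the enumeration simply makes explicit which substitutions a screen at that site could sample.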

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger.

Substitutions, deletions, insertions or any combination thereof may be used to 
arrive at a final derivative. Generally these changes are done on a few amino acids to
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the prostate cancer protein are 
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will
elicit the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively, 
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by 

15 selecting substitutions that are less conservative than those described above. For example, 
substitutions may be made which more significantly affect: the structure of the polypeptide 
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; 
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. 
The substitutions which in general are expected to produce the greatest changes in the 

20 polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl is 
substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or 
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue 
having an electropositive side chain, e.g. lysyl, arginyl, or histidyl, is substituted for (or by) 
an electronegative residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g. glycine.
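The four categories (a) through (d) above can be sketched as a simple lookup. The following is a minimal illustration only: the residue groups contain just the examples named in the text (as one-letter codes), and the group names and function names are hypothetical choices made for this sketch, not part of the disclosure.

```python
# Residue groups use only the one-letter codes for the examples named in (a)-(d).
# The group names and function names are hypothetical, chosen for this sketch.
HYDROPHILIC = {"S", "T"}                  # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}         # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}              # glutamyl, aspartyl
BULKY = {"F"}                             # phenylalanine
NO_SIDE_CHAIN = {"G"}                     # glycine


def _between(a, b, group1, group2):
    """True if one residue is in group1 and the other in group2, in either direction."""
    return (a in group1 and b in group2) or (a in group2 and b in group1)


def disruptive_categories(original, replacement):
    """Return the labels of categories (a)-(d) that a substitution falls into."""
    if original == replacement:
        return []  # not a substitution
    labels = []
    if _between(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        labels.append("a")
    if "C" in (original, replacement) or "P" in (original, replacement):
        labels.append("b")  # cysteine or proline swapped for (or by) any other residue
    if _between(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        labels.append("c")
    if _between(original, replacement, BULKY, NO_SIDE_CHAIN):
        labels.append("d")
    return labels
```

An empty result means only that the substitution matches none of the listed examples; it does not by itself establish that the substitution is conservative.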

Covalent modifications of prostate cancer polypeptides are included within the 
scope of this invention. One type of covalent modification includes reacting targeted amino 
acid residues of a prostate cancer polypeptide with an organic derivatizing agent that is 
capable of reacting with selected side chains or the N- or C-terminal residues of a prostate
cancer polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking prostate cancer polypeptides to a water-insoluble support matrix or surface for
use in the method for purifying anti-prostate cancer polypeptide antibodies or screening
assays, as is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as bis-N-
maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the amino groups of the lysine, arginine, and histidine side chains (Creighton,
Proteins: Structure and Molecular Properties, pp. 79-86 (1983)), acetylation of the N- 
terminal amine, and amidation of any C-terminal carboxyl group. 

Another type of covalent modification of the prostate cancer polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes
herein to mean deleting one or more carbohydrate moieties found in native sequence prostate
cancer polypeptide, and/or adding one or more glycosylation sites that are not present in the 
native sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many 
ways. For example the use of different cell types to express prostate cancer-associated
sequences can result in different glycosylation patterns.

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level,
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids. 

Another means of increasing the number of carbohydrate moieties on the 
prostate cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330, and in Aplin &
Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the prostate cancer polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol., 138:350 (1987).

Another type of covalent modification of the prostate cancer polypeptide comprises linking
the polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337. 

Prostate cancer polypeptides of the present invention may also be modified in 
a way to form chimeric molecules comprising a prostate cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which 
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is 
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag 
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative 
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide 
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the 
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; 
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field et
al., Mol. Cell. Biol. 8:2159-2165 (1988)); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto (Evan et al., Molecular and Cellular Biology 5:3610-3616 (1985));
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky et al.,
Protein Engineering 3(6):547-553 (1990)). Other tag polypeptides include the Flag-peptide
(Hopp et al., BioTechnology 6:1204-1210 (1988)); the KT3 epitope peptide (Martin et al.,
Science 255:192-194 (1992)); tubulin epitope peptide (Skinner et al., J. Biol. Chem.
266:15163-15166 (1991)); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA 87:6393-6397 (1990)).

Also included are other prostate cancer proteins of the prostate cancer family, 
and prostate cancer proteins from other organisms, which are cloned and expressed as 
outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer 
sequences may be used to find other related prostate cancer proteins from humans or other
organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR
primer sequences include the unique areas of the prostate cancer nucleic acid sequence. As is 
generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides 
in length, with from about 20 to about 30 being preferred, and may contain inosine as needed. 
The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols,
supra).
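The primer length ranges stated above can be checked mechanically. The sketch below is illustrative only: it treats the "about" bounds as hard limits, and it assumes candidate primers are written with the letters A, C, G and T, the letter I for inosine, and the standard IUPAC ambiguity codes for degenerate positions. The function name and the returned labels are hypothetical.

```python
# Letters accepted in a primer string for this sketch: the four bases,
# I for inosine, and the IUPAC ambiguity codes used for degenerate positions.
ALLOWED = set("ACGT") | {"I"} | set("RYSWKMBDHVN")


def primer_length_class(primer):
    """Classify a primer by length against the ranges stated in the text."""
    seq = primer.upper()
    if not seq or set(seq) - ALLOWED:
        raise ValueError("primer is empty or contains unsupported characters")
    n = len(seq)
    if 20 <= n <= 30:
        return "preferred"       # about 20 to about 30 nucleotides
    if 15 <= n <= 35:
        return "acceptable"      # about 15 to about 35 nucleotides
    return "outside range"
```

A length check of this kind says nothing about specificity; the text's preference for unique areas of the prostate cancer nucleic acid sequence would have to be assessed separately.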

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to 
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind 
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies 
made to a smaller prostate cancer protein will be able to bind to the full-length protein, 
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan 
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation. 

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal 
antibodies may be prepared using hybridoma methods, such as those described by Kohler &
Milstein, Nature 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal, is typically immunized with an immunizing agent to elicit 
lymphocytes that produce or are capable of producing antibodies that will specifically bind to 
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The 
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables 1-16,
a fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then 
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene 
glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice,
pp. 59-103 (1986)). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse 
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture 
medium that preferably contains one or more substances that inhibit the growth or survival of 
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

In one embodiment, the antibodies are bispecific antibodies. Bispecific 
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-16 or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents. 

In a preferred embodiment, the antibodies to prostate cancer protein are
capable of reducing or eliminating a biological function of a prostate cancer protein, as is
described below. That is, the addition of anti-prostate cancer protein antibodies (either 
polyclonal or preferably monoclonal) to prostate cancer tissue (or cells containing prostate 
cancer) may reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in 
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred. 

In a preferred embodiment the antibodies to the prostate cancer proteins are 
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain 
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include 
human immunoglobulins (recipient antibody) in which residues from a complementary 
determining region (CDR) of the recipient are replaced by residues from a CDR of a non-
human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human 
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
may also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, a humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise 
at least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992)). Humanization
can be essentially performed following the method of Winter and co-workers (Jones et al.,
Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al.,
Science 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a non- 
human species. 

Human antibodies can also be produced using various techniques known in the 
art, including phage display libraries (Hoogenboom & Winter, J. Mol. Biol. 227:381 (1991);
Marks et al., J. Mol. Biol. 222:581 (1991)). The techniques of Cole et al. and Boerner et al.
are also available for the preparation of human monoclonal antibodies (Cole et al.,
Monoclonal Antibodies and Cancer Therapy, p. 77 (1985) and Boerner et al., J. Immunol.
147(1):86-95 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in all
respects, including gene rearrangement, assembly, and antibody repertoire. This approach is
described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425;
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology 10:779-
783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13
(1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature
Biotechnology 14:826 (1996); Lonberg & Huszar, Intern. Rev. Immunol. 13:65-93 (1995).

By immunotherapy is meant treatment of prostate cancer with an antibody 
raised against prostate cancer proteins. As used herein, immunotherapy can be passive or
active. Passive immunotherapy as defined herein is the passive transfer of antibody to a
recipient (patient). Active immunization is the induction of antibody and/or T-cell responses
in a recipient (patient). Induction of an immune response is the result of providing the 
recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary 
skill in the art, the antigen may be provided by injecting a polypeptide against which
antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic
acid capable of expressing the antigen and under conditions for expression of the antigen,
leading to an immune response. 

In a preferred embodiment the prostate cancer proteins against which 
antibodies are raised are secreted proteins as described above. Without being bound by 
theory, antibodies used for treatment bind and prevent the secreted protein from binding to
its receptor, thereby inactivating the secreted prostate cancer protein. 

In another preferred embodiment, the prostate cancer protein to which 
antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies 
used for treatment bind the extracellular domain of the prostate cancer protein and prevent it
from binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will 
be appreciated by one of ordinary skill in the art, the antibody may be a competitive, non- 
competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the 
prostate cancer protein. The antibody is also an antagonist of the prostate cancer protein.
Further, the antibody prevents activation of the transmembrane prostate cancer protein. In
one aspect, when the antibody prevents the binding of other molecules to the prostate cancer
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to TNF-α, TNF-β, IL-1, IFN-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein thereby
mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, prostate cancer is
treated by administering to a patient antibodies directed against the transmembrane prostate
cancer protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector 
moiety. The effector moiety can be any number of molecules, including labelling moieties 
such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect 
the therapeutic moiety is a small molecule that modulates the activity of the prostate cancer
protein. In another aspect the therapeutic moiety modulates the activity of molecules
associated with or in close proximity to the prostate cancer protein. The therapeutic moiety
may inhibit enzymatic activity such as protease or collagenase or protein kinase activity
associated with prostate cancer. 

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic 
agent. In this method, targeting the cytotoxic agent to prostate cancer tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited 
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their 
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A 
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against prostate
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently 
attached to the antibody. Targeting the therapeutic moiety to transmembrane prostate cancer 
proteins not only serves to increase the local concentration of therapeutic moiety in the 
prostate cancer afflicted area, but also serves to reduce deleterious side effects that may be
associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the 
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated 
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by 
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, where the prostate cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto contains a signal for that target localization, e.g., a
nuclear localization signal.

The prostate cancer antibodies of the invention specifically bind to prostate 
cancer proteins. By "specifically bind" herein is meant that the antibodies bind to the protein
with a Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about
0.1 nM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important. 

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing prostate cancer) and in prostate cancer tissue (and in some cases, for
varying severities of prostate cancer that relate to prognosis, as outlined below) are evaluated 
to provide expression profiles. An expression profile of a particular cell state or point of 
development is essentially a "fingerprint" of the state. While two states may have any 
particular gene similarly expressed, the evaluation of a number of genes simultaneously
allows the generation of a gene expression profile that is reflective of the state of the cell. By
comparing expression profiles of cells in different states, information regarding which genes 
are important (including both up- and down-regulation of genes) in each of these states is 
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or cancerous tissue. This will provide for
molecular diagnosis of related conditions. 
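One way to picture the comparison of a sample "fingerprint" against reference profiles is to score the sample against each reference and report the closest match. The sketch below uses Pearson correlation as the similarity measure; that choice, the function names, and any gene names or expression values supplied by a caller are assumptions made for illustration and are not taken from the disclosure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / (sd_x * sd_y)


def closest_state(sample, references):
    """Return the name of the reference profile most correlated with the sample.

    sample: dict mapping gene name -> expression level
    references: dict mapping state name -> dict of gene name -> expression level
    Only genes present in both the sample and a reference are compared.
    """
    def score(state):
        profile = references[state]
        genes = sorted(set(sample) & set(profile))
        return pearson([sample[g] for g in genes], [profile[g] for g in genes])

    return max(references, key=score)
```

In practice the reference profiles would come from measured normal and prostate cancer tissue, and a real classifier would need validation; this sketch shows only the shape of the comparison.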

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression 
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus prostate cancer tissue. Genes may be turned on or turned off in a particular 
state, relative to another state thus permitting comparison of two or more states. A 
qualitatively regulated gene will exhibit an expression pattern within a state or cell type 
which is detectable by standard techniques. Some genes will be expressed in one state or cell
type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in
that expression is increased or decreased; i.e., gene expression is either upregulated, resulting 
in an increased amount of transcript, or downregulated, resulting in a decreased amount of 
transcript. The degree to which expression differs need only be large enough to quantify via 
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is at least
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.

Evaluation may be at the gene transcript, or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent 
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the 
final gene product itself (protein) can be monitored, e.g., with antibodies to the prostate 
cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, including
mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to 
prostate cancer genes, i.e., those identified as being important in a prostate cancer phenotype, 
can be evaluated in a prostate cancer diagnostic test. 
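Once transcript or protein levels have been quantified, the percent-change thresholds stated in this section for differential expression (about 50%, 100%, 150%, 200%, and 300% or more) can be applied as in the sketch below. It assumes that percent change is computed relative to the reference (e.g., normal tissue) level and treats the "about" thresholds as hard limits; the function names are hypothetical. Note that under this definition a decrease can never exceed 100%, so strong downregulation may be better expressed as a fold change.

```python
# Thresholds, in percent, taken from the preferred ranges stated in the text.
THRESHOLDS = (50, 100, 150, 200, 300)


def percent_change(reference, sample):
    """Percent change of the sample level relative to the reference level."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return (sample - reference) / reference * 100.0


def highest_threshold_met(reference, sample):
    """Largest stated threshold reached by the magnitude of the change, or None."""
    magnitude = abs(percent_change(reference, sample))
    met = [t for t in THRESHOLDS if magnitude >= t]
    return max(met) if met else None
```

For example, a reference level of 100 and a sample level of 250 (arbitrary units) is a 150% increase, so the 150% threshold is the highest one met.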

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. The assays are further described below in the example. PCR techniques
can be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the prostate cancer protein 
are detected. Although DNA or RNA encoding the prostate cancer protein may be detected, 
of particular interest are methods wherein an mRNA encoding a prostate cancer protein is 
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be 
examined on a solid support such as nylon membranes and hybridizing the probe with the 
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid 
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following 
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
prostate cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-
indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins 
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing prostate cancer sequences are used in diagnostic assays. This can be performed on 
an individual gene or corresponding polypeptide level. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, prostate cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of prostate cancer. 
Detection of these proteins in putative prostate cancer tissue allows for detection or diagnosis 
of prostate cancer. In one embodiment, antibodies are used to detect prostate cancer proteins.
A preferred method separates proteins from a sample by electrophoresis on a gel (typically a
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the prostate cancer protein is 
detected, e.g., by immunoblotting with antibodies raised against the prostate cancer protein. 
Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the prostate cancer protein find use
in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one 
to many antibodies to the prostate cancer protein(s). Following washing to remove non- 
specific antibody binding, the presence of the antibody or antibodies is detected. In one
embodiment the antibody is detected by incubating with a secondary antibody that contains a
detectable label. In another method the primary antibody to the prostate cancer protein(s) 
contains a detectable label, e.g. an enzyme marker that can act on a substrate. In another 
preferred embodiment each one of multiple primary antibodies contains a distinct and 
detectable label. This method finds particular use in simultaneous screening for a plurality of
prostate cancer proteins. As will be appreciated by one of ordinary skill in the art, many
other histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a 
fluorescence activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing prostate 
cancer from blood, serum, plasma, stool, and other samples. Such samples, therefore, are
useful as samples to be probed or tested for the presence of prostate cancer proteins. 
Antibodies can be used to detect a prostate cancer protein by previously described 
immunoassay techniques including ELISA, immunoblotting (western blotting), 
immunoprecipitation, BIACORE technology and the like. Conversely, the presence of 
antibodies may indicate an immune response against an endogenous prostate cancer protein.

In a preferred embodiment, in situ hybridization of labeled prostate cancer 
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including 
prostate cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., 
Ausubel, supra) is then performed. When comparing the fingerprints between an individual 
and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on
the findings. It is further understood that the genes which indicate the diagnosis may differ 
from those which indicate the prognosis and molecular profiling of the condition of the cells 
may lead to distinctions between responsive or refractory conditions or may be predictive of 
outcomes. 

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic 
acids, modified proteins and cells containing prostate cancer sequences are used in prognosis 
assays. As above, gene expression profiles can be generated that correlate to prostate cancer, 
in terms of long term prognosis. Again, this may be done on either a protein or gene level, 
with the use of genes being preferred. As above, prostate cancer probes may be attached to 
biochips for the detection and quantification of prostate cancer sequences in a tissue or 
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more 
sensitive and accurate quantification. 

Assays for therapeutic compounds 
In a preferred embodiment members of the proteins, nucleic acids, and 
antibodies as described herein are used in drug screening assays. The prostate cancer 
proteins, antibodies, nucleic acids, modified proteins and cells containing prostate cancer 
sequences are used in drug screening assays or by evaluating the effect of drug candidates on 
a "gene expression profile" or expression profile of polypeptides. In a preferred embodiment, 
the expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes after treatment with a candidate 
agent (e.g., Zlokarnik et al., Science 279:84-8 (1998); Heid, Genome Res. 6:986-94 (1996)). 

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic 
acids, modified proteins and cells containing the native or modified prostate cancer proteins 
are used in screening assays. That is, the present invention provides novel methods for 
screening for compositions which modulate the prostate cancer phenotype or an identified 
physiological function of a prostate cancer protein. As above, this can be done on an 
individual gene level or by evaluating the effect of drug candidates on a "gene expression 
profile". In a preferred embodiment, the expression profiles are used, preferably in 
conjunction with high throughput screening techniques to allow monitoring for expression 
profile genes after treatment with a candidate agent, see Zlokarnik, supra. 

Having identified the differentially expressed genes herein, a variety of assays 
may be executed. In a preferred embodiment, assays may be run on an individual gene or 
protein level. That is, having identified a particular gene as up-regulated in prostate cancer, 
test compounds can be screened for the ability to modulate gene expression or for binding to 
the prostate cancer protein. "Modulation" thus includes both an increase and a decrease in 
gene expression. The preferred amount of modulation will depend on the original change of 
the gene expression in normal tissue versus tissue undergoing prostate cancer, with changes 
of at least 10%, preferably 50%, more preferably 100-300%, and in some embodiments 
300-1000% or greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue 
compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold 
decrease in prostate cancer tissue compared to normal tissue often provides a target value of a 
10-fold increase in expression to be induced by the test compound. 
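
The relationship between the observed change and the desired modulation can be illustrated with a minimal sketch. The function name and the expression values below are hypothetical and serve only to restate the 4-fold and 10-fold examples given above; they are not part of the disclosed assays.

```python
def target_modulation(cancer_level: float, normal_level: float) -> tuple[str, float]:
    """Return the direction and fold magnitude of the change a test compound
    would need to induce to bring expression back to the normal level.

    Both levels are hypothetical expression measurements in the same
    arbitrary units.
    """
    if cancer_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    if cancer_level > normal_level:
        # Gene is up-regulated in cancer tissue: a decrease is desired.
        return ("decrease", cancer_level / normal_level)
    if cancer_level < normal_level:
        # Gene is down-regulated in cancer tissue: an increase is desired.
        return ("increase", normal_level / cancer_level)
    return ("none", 1.0)


# A 4-fold increase in cancer tissue calls for about a 4-fold decrease.
print(target_modulation(400.0, 100.0))  # ('decrease', 4.0)
# A 10-fold decrease in cancer tissue calls for about a 10-fold increase.
print(target_modulation(10.0, 100.0))   # ('increase', 10.0)
```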

The amount of gene expression may be monitored using nucleic acid probes 
and the quantification of gene expression levels, or, alternatively, the gene product itself can 
be monitored, e.g., through the use of antibodies to the prostate cancer protein and standard 
immunoassays. Proteomics and separation techniques may also allow quantification of 
expression. 

In a preferred embodiment, the gene expression or protein levels of a number 
of entities, i.e., an expression profile, are monitored simultaneously. Such profiles will 
typically involve a plurality of those entities described herein. 

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. Alternatively, PCR may be used. Thus, a series of, e.g., microtiter plates 
may be used with dispensed primers in desired wells. A PCR reaction can then be performed 
and analyzed for each well. 

Expression monitoring can be performed to identify compounds that modify 
the expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide 
sequence set out in Tables 1-16. Generally, in a preferred embodiment, a test modulator is 
added to the cells prior to analysis. Moreover, screens are also provided to identify agents 
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer 
protein, or interfere with the binding of a prostate cancer protein and an antibody or other 
binding partner. 

The term "test compound" or "drug candidate" or "modulator" or grammatical 
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic 
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or 
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence, 
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter 
expression profiles, or expression profile nucleic acids or proteins provided herein. In one 
embodiment, the modulator suppresses a prostate cancer phenotype, e.g., to a normal tissue 
fingerprint. In another embodiment, a modulator induces a prostate cancer phenotype. 
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations 
to obtain a differential response to the various concentrations. Typically, one of these 
concentrations serves as a negative control, i.e., at zero concentration or below the level of 
detection. 
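
By way of illustration only, the differential response across a concentration series can be expressed relative to the negative control. The following sketch assumes hypothetical signal readings keyed by agent concentration, with the zero-concentration mixture serving as the negative control; the function name and the data are illustrative assumptions.

```python
def differential_response(signal_by_conc: dict[float, float]) -> dict[float, float]:
    """Express each assay signal as a fraction of the negative control.

    `signal_by_conc` maps agent concentration to a hypothetical assay
    readout. The entry at concentration 0.0 is treated as the negative
    control and is required.
    """
    if 0.0 not in signal_by_conc:
        raise ValueError("a zero-concentration negative control is required")
    control = signal_by_conc[0.0]
    if control == 0:
        raise ValueError("negative control signal must be non-zero")
    return {
        conc: signal / control
        for conc, signal in sorted(signal_by_conc.items())
        if conc != 0.0
    }


# Hypothetical readings from assay mixtures run in parallel.
readings = {0.0: 200.0, 1.0: 150.0, 10.0: 100.0, 100.0: 50.0}
print(differential_response(readings))  # {1.0: 0.75, 10.0: 0.5, 100.0: 0.25}
```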

Drug candidates encompass numerous chemical classes, though typically they 
are organic molecules, preferably small organic compounds having a molecular weight of 
more than 100 and less than about 2,500 daltons. Preferred small molecules are less than 
2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents comprise 
functional groups necessary for structural interaction with proteins, particularly hydrogen 
bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, 
preferably at least two of the functional chemical groups. The candidate agents often 
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic 
structures substituted with one or more of the above functional groups. Candidate agents are 
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, 
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred 
are peptides. 

In one aspect, a modulator will neutralize the effect of a prostate cancer 
protein. By "neutralize" is meant that the activity of a protein, and its consequent effect on 
the cell, is inhibited or blocked. 

In certain embodiments, combinatorial libraries of potential modulators will be 
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity. 
Conventionally, new chemical entities with useful properties are generated by identifying a 
chemical compound (called a "lead compound") with some desirable property or activity, 
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property 
and activity of those variant compounds. Often, high throughput screening (HTS) methods 
are employed for such an analysis. 

In one preferred embodiment, high throughput screening methods involve 
providing a library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as 
conventional "lead compounds" or can themselves be used as potential or actual therapeutics. 

A combinatorial chemical library is a collection of diverse chemical 
compounds generated by either chemical synthesis or biological synthesis by combining a 
number of chemical "building blocks" such as reagents. For example, a linear combinatorial 
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of 
chemical building blocks called amino acids in every possible way for a given compound 
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical 
compounds can be synthesized through such combinatorial mixing of chemical building 
blocks (Gallop et al., J. Med. Chem. 37(9):1233-1251 (1994)). 
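
The scale of such a linear library follows from simple counting: with 20 amino acid building blocks, a peptide of length n has 20^n possible sequences. The sketch below is illustrative only; the names are hypothetical, and it assumes the 20 standard amino acids in one-letter code as the building block set.

```python
from itertools import product

# Assumption: the building blocks are the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length: int, building_blocks: int = len(AMINO_ACIDS)) -> int:
    """Number of distinct linear compounds of the given length."""
    return building_blocks ** length


def enumerate_library(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every sequence of the given length over the alphabet."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


print(library_size(5))                    # 3200000
print(len(list(enumerate_library(2))))    # 400
print(next(enumerate_library(3)))         # AAA
```

A pentapeptide library over this alphabet therefore already contains 3.2 million members, consistent with the "millions of chemical compounds" noted above.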

Preparation and screening of combinatorial chemical libraries is well known to 
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, 
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka, Pept. Prot. Res. 37:487-493 
(1991), Houghton et al., Nature, 354:84-88 (1991)), peptoids (PCT Publication No. WO 
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT 
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as 
hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA 
90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc. 
114:6568 (1992)), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding 
(Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses 
of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)), 
oligocarbamates (Cho et al., Science 261:1303 (1993)), and/or peptidyl phosphonates 
(Campbell et al., J. Org. Chem. 59:658 (1994)). See, generally, Gordon et al., J. Med. Chem. 
37:1385 (1994), nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid 
libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature 
Biotechnology 14(3):309-314 (1996), and PCT/US96/10287), carbohydrate libraries (see, 
e.g., Liang et al., Science 274:1520-1522 (1996), and U.S. Patent No. 5,593,853), and small 
organic molecule libraries (see, e.g., benzodiazepines, Baum, C&EN, Jan 18, page 33 (1993); 
isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 
5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, 
U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like). 

Devices for the preparation of combinatorial libraries are commercially 
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, 
Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, 
Bedford, MA). 

A number of well known robotic systems have also been developed for 
solution phase chemistries. These systems include automated workstations like the 
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, 
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation, 
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual 
synthetic operations performed by a chemist. Any of the above devices are suitable for use 
with the present invention. The nature and implementation of modifications to these devices 
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the 
relevant art. In addition, numerous combinatorial libraries are themselves commercially 
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis, 
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences, 
Columbia, MD, etc.). 

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription, 
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other 
properties of particular nucleic acids or protein products are well known to those of skill in 
the art. Binding assays and reporter gene assays are similarly well known. Thus, 
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins, 
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid 
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high 
throughput methods of screening for ligand/antibody binding. 

In addition, high throughput screening systems are commercially available 
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman 
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems 
typically automate entire procedures, including all sample and reagent pipetting, liquid 
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate 
for the assay. These configurable systems provide high throughput and rapid start up as well 
as a high degree of flexibility and customization. The manufacturers of such systems provide 
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides 
technical bulletins describing screening systems for detecting the modulation of gene 
transcription, ligand binding, and the like. 

In one embodiment, modulators are proteins, often naturally occurring 
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing 
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In 
this way libraries of proteins may be made for screening in the methods of the invention. 
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and 
mammalian proteins, with the latter being preferred, and human proteins being especially 
preferred. Particularly useful test compounds will be directed to the class of proteins to which 
the target belongs, e.g., substrates for enzymes or ligands and receptors. 

In a preferred embodiment, modulators are peptides of from about 5 to about 
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 
to about 15 being particularly preferred. The peptides may be digests of naturally occurring 
proteins as is outlined above, random peptides, or "biased" random peptides. By 
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide 
consists of essentially random nucleotides and amino acids, respectively. Since generally 
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they 
may incorporate any nucleotide or amino acid at any position. The synthetic process can be 
designed to generate randomized proteins or nucleic acids, to allow the formation of all or 
most of the possible combinations over the length of the sequence, thus forming a library of 
randomized candidate bioactive proteinaceous agents. 

In one embodiment, the library is fully randomized, with no sequence 
preferences or constants at any position. In a preferred embodiment, the library is biased. 
That is, some positions within the sequence are either held constant, or are selected from a 
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or 
amino acid residues are randomized within a defined class, e.g., of hydrophobic amino acids, 
hydrophilic residues, sterically biased (either small or large) residues, towards the creation of 
nucleic acid binding domains, the creation of cysteines for cross-linking, prolines for SH-3 
domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to 
purines, etc. 
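
The distinction between a fully randomized and a biased library can be sketched computationally. In the following illustration each position of a template lists the residues permitted at that position: a single residue holds the position constant, a restricted set biases it, and the full alphabet leaves it random. The template, the grouping of residues labeled hydrophobic, and all names are hypothetical assumptions for illustration, not a disclosed library design.

```python
import random

ANY = "ACDEFGHIKLMNPQRSTVWY"   # assumption: the 20 standard amino acids
HYDROPHOBIC = "AVLIMFW"        # illustrative grouping only


def biased_library(template: list[str], size: int, seed: int = 0) -> list[str]:
    """Draw `size` peptides, choosing each position from its allowed residues."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [
        "".join(rng.choice(allowed) for allowed in template)
        for _ in range(size)
    ]


# Position 1 held constant (Cys), positions 3-4 restricted to one class,
# position 7 held constant (Pro), remaining positions fully random.
template = ["C", ANY, HYDROPHOBIC, HYDROPHOBIC, ANY, ANY, "P"]
for peptide in biased_library(template, size=5):
    print(peptide)
```

Passing a template in which every position is `ANY` yields the fully randomized case.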

Modulators of prostate cancer can also be nucleic acids, as defined below. As 
described above generally for proteins, nucleic acid modulating agents may be naturally 
occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For 
example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for 
proteins. 

In certain embodiments, the activity of a prostate cancer-associated protein is 
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic 
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA 
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof. 
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability 
of the mRNA. 

In the context of this invention, antisense polynucleotides can comprise 
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring 
subunits or their close homologs. Antisense polynucleotides may also have altered sugar 
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other 
sulfur containing species which are known for use in the art. Analogs are comprehended by 
this invention so long as they function effectively to hybridize with the prostate cancer 
protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA. 

Such antisense polynucleotides can readily be synthesized using recombinant 
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several 
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art. 

Antisense molecules as used herein include antisense or sense 
oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by 
binding to the anti-sense strand. The antisense and sense oligonucleotides comprise a 
single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA 
(sense) or DNA (antisense) sequences for prostate cancer molecules. Antisense or sense 
oligonucleotides, according to the present invention, comprise a fragment generally of at least 
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an 
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein 
is described in, e.g., Stein & Cohen (Cancer Res. 48:2659 (1988)) and van der Krol et al. 
(BioTechniques 6:958 (1988)). 
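
As a simple illustration of deriving candidate antisense sequences from a cDNA, each window of the sense-strand sequence can be reverse complemented. The sketch below is an assumption-laden example: the function name and the input sequence are hypothetical, the output is written as DNA, and no selection for specificity, secondary structure, or accessibility of the target is attempted.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_candidates(cdna: str, length: int = 20) -> list[str]:
    """Return the reverse complement of every window of `length` bases.

    `cdna` is the sense-strand coding sequence, written 5' to 3'.
    The 14 to 30 nucleotide bounds follow the preferred range in the text.
    """
    cdna = cdna.upper()
    if not 14 <= length <= 30:
        raise ValueError("length should be from 14 to 30 nucleotides")
    if set(cdna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return [
        cdna[i:i + length].translate(COMPLEMENT)[::-1]
        for i in range(len(cdna) - length + 1)
    ]


# Hypothetical 15-base sense sequence; two 14-mer windows result.
print(antisense_candidates("ATGGCCAAGTTGACC", length=14))
# ['GTCAACTTGGCCAT', 'GGTCAACTTGGCCA']
```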

In addition to antisense polynucleotides, ribozymes can be used to target and 
inhibit transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an 
RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes 
have been described, including group I ribozymes, hammerhead ribozymes, hairpin 
ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al., Adv. in 
Pharmacology 25:289-317 (1994) for a general review of the properties of different 
ribozymes). 

The general features of hairpin ribozymes are described, e.g., in Hampel et al., 
Nucl. Acids Res. 18:299-304 (1990); European Patent Publication No. 0 360 257; U.S. Patent 
No. 5,254,678. Methods of preparing are well known to those of skill in the art (see, e.g., 
WO 94/26877; Ojwang et al., Proc. Natl. Acad. Sci. USA 90:6340-6344 (1993); Yamada et 
al., Human Gene Therapy 1:39-45 (1994); Leavitt et al., Proc. Natl. Acad. Sci. USA 
92:699-703 (1995); Leavitt et al., Human Gene Therapy 5:1151-120 (1994); and Yamada et al., 
Virology 205:121-126 (1994)). 

Polynucleotide modulators of prostate cancer may be introduced into a cell 
containing the target nucleotide sequence by formation of a conjugate with a ligand binding 
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are 
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that 
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does 
not substantially interfere with the ability of the ligand binding molecule to bind to its 
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide 
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate 
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by 
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is 
understood that antisense molecules or knock out and knock in models may also be 
used in screening assays as discussed above, in addition to methods of treatment. 

As noted above, gene expression monitoring is conveniently used to test 
candidate modulators (e.g., protein, nucleic acid or small molecule). After the candidate agent 
has been added and the cells allowed to incubate for some period of time, the sample 
containing a target sequence to be analyzed is added to the biochip. If required, the target 
sequence is prepared using known techniques. For example, the sample may be treated to 
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or 
amplification such as PCR performed as appropriate. For example, an in vitro transcription 
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids 
are labeled with biotin-FITC or PE, or with cy3 or cy5. 

In a preferred embodiment, the target sequence is labeled with, e.g., a 
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of 
detecting the target sequence's specific binding to a probe. The label also can be an enzyme, 
such as alkaline phosphatase or horseradish peroxidase, which when provided with an 
appropriate substrate produces a product that can be detected. Alternatively, the label can be 
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not 
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an 
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the 
streptavidin is labeled as described above, thereby providing a detectable signal for the 
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis. 

As will be appreciated by those in the art, these assays can be direct 
hybridization assays or can comprise "sandwich assays", which include the use of multiple 
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730, 
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352, 
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by 
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined 
above, and then added to the biochip comprising a plurality of nucleic acid probes, under 
conditions that allow the formation of a hybridization complex. 

A variety of hybridization conditions may be used in the present invention, 
including high, moderate and low stringency conditions as outlined above. The assays are 
generally run under stringency conditions which allow formation of the label probe 
hybridization complex only in the presence of target. Stringency can be controlled by 
altering a step parameter that is a thermodynamic variable, including, but not limited to, 
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH, 
organic solvent concentration, etc. 

These parameters may also be used to control non-specific binding, as is 
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain 
steps at higher stringency conditions to reduce non-specific binding. 

The reactions outlined herein may be accomplished in a variety of ways. 
Components of the reaction may be added simultaneously, or sequentially, in different orders, 
with preferred embodiments outlined below. In addition, the reaction may include a variety 
of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc. 
which may be used to facilitate optimal hybridization and detection, and/or reduce 
non-specific or background interactions. Reagents that otherwise improve the efficiency of the 
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be 
used as appropriate, depending on the sample preparation methods and purity of the target. 

The assay data are analyzed to determine the expression levels, and changes in 
expression levels as between states, of individual genes, forming a gene expression profile. 
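
One common way to summarize changes in expression levels between two states is a per-gene log2 ratio. The sketch below assumes hypothetical, already normalized expression values for each state; the gene names, the values, and the choice of a log2 ratio are illustrative assumptions rather than the analysis prescribed herein.

```python
import math


def expression_profile(
    state_a: dict[str, float], state_b: dict[str, float]
) -> dict[str, float]:
    """Return log2(state_b / state_a) for every gene measured in both states.

    Positive values indicate higher expression in state_b, negative values
    lower expression, and 0.0 no change.
    """
    profile = {}
    for gene in sorted(state_a.keys() & state_b.keys()):
        a, b = state_a[gene], state_b[gene]
        if a <= 0 or b <= 0:
            raise ValueError(f"non-positive expression level for {gene}")
        profile[gene] = math.log2(b / a)
    return profile


# Hypothetical normalized expression levels.
normal = {"geneA": 100.0, "geneB": 80.0, "geneC": 50.0}
cancer = {"geneA": 400.0, "geneB": 80.0, "geneC": 25.0}
print(expression_profile(normal, cancer))
# {'geneA': 2.0, 'geneB': 0.0, 'geneC': -1.0}
```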

Screens are performed to identify modulators of the prostate cancer 
phenotype. In one embodiment, screening is performed to identify modulators that can 
induce or suppress a particular expression profile, thus preferably generating the associated 
phenotype. In another embodiment, e.g., for diagnostic applications, having identified 
differentially expressed genes important in a particular state, screens can be performed to 
identify modulators that alter expression of individual genes. In another embodiment, 
screening is performed to identify modulators that alter a biological function of the 
expression product of a differentially expressed gene. Again, having identified the 
importance of a gene in a particular state, screens are performed to identify agents that bind 
and/or modulate the biological activity of the gene product. 

In addition, screens can be done for genes that are induced in response to a 
candidate agent. After identifying a modulator based upon its ability to suppress a prostate 
cancer expression pattern leading to a normal expression pattern, or to modulate a single 
prostate cancer gene expression profile so as to mimic the expression of the gene from 
normal tissue, a screen as described above can be performed to identify genes that are 
specifically modulated in response to the agent. Comparing expression profiles between 
normal tissue and agent treated prostate cancer tissue reveals genes that are not expressed in 
normal tissue or prostate cancer tissue, but are expressed in agent treated tissue. These 
agent-specific sequences can be identified and used by methods described herein for prostate cancer 
genes or proteins. In particular these sequences and the proteins they encode find use in 
marking or identifying agent treated cells. In addition, antibodies can be raised against the 
agent induced proteins and used to target novel therapeutics to the treated prostate cancer 
tissue sample. 
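
The identification of agent-specific sequences described above amounts to a set comparison across three profiles. The following sketch is illustrative only: the expression threshold used to call a gene "expressed", the gene names, and the values are all hypothetical assumptions.

```python
def agent_specific_genes(
    normal: dict[str, float],
    cancer: dict[str, float],
    treated: dict[str, float],
    threshold: float = 1.0,
) -> set[str]:
    """Genes expressed in agent treated tissue but in neither untreated state.

    A gene counts as expressed when its level is at or above `threshold`,
    which is an assumed cutoff chosen for illustration.
    """
    def expressed(profile: dict[str, float]) -> set[str]:
        return {gene for gene, level in profile.items() if level >= threshold}

    return expressed(treated) - expressed(normal) - expressed(cancer)


# Hypothetical expression levels in arbitrary units.
normal = {"g1": 5.0, "g2": 0.0, "g3": 0.0}
cancer = {"g1": 20.0, "g2": 8.0, "g3": 0.0}
treated = {"g1": 6.0, "g2": 0.5, "g3": 12.0}
print(agent_specific_genes(normal, cancer, treated))  # {'g3'}
```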

Thus, in one embodiment, a test compound is administered to a population of 
prostate cancer cells that have an associated prostate cancer expression profile. By 
"administration" or "contacting" herein is meant that the candidate agent is added to the cells 
in such a manner as to allow the agent to act upon the cell, whether by uptake and 
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid 
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct 
such as an adenoviral or retroviral construct, and added to the cell, such that expression of 
the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems 
can also be used. 

Once the test compound has been administered to the cells, the cells can be 
washed if desired and are allowed to incubate under preferably physiological conditions for 
some period of time. The cells are then harvested and a new gene expression profile is 
generated, as outlined herein. 

Thus, e.g., prostate cancer tissue may be screened for agents that modulate, 
e.g., induce or suppress the prostate cancer phenotype. A change in at least one gene, 
preferably many, of the expression profile indicates that the agent has an effect on prostate 
cancer activity. By defining such a signature for the prostate cancer phenotype, screens for 
new drugs that alter the phenotype can be devised. With this approach, the drug target need 
not be known and need not be represented in the original expression screening platform, nor 
does the level of transcript for the target protein need to change. 
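
A minimal sketch of such a signature-based screen compares the profile before and after treatment and reports which genes changed. The 10% minimum change used here echoes the lower bound for modulation mentioned earlier but is applied as an assumed cutoff; the gene names and values are hypothetical.

```python
def changed_genes(
    before: dict[str, float],
    after: dict[str, float],
    min_fraction: float = 0.10,
) -> list[str]:
    """List genes whose level changed by at least `min_fraction` of baseline."""
    changed = []
    for gene in sorted(before.keys() & after.keys()):
        baseline = before[gene]
        if baseline <= 0:
            continue  # relative change is undefined without a positive baseline
        if abs(after[gene] - baseline) / baseline >= min_fraction:
            changed.append(gene)
    return changed


# Hypothetical profile of prostate cancer cells before and after an agent.
before = {"g1": 100.0, "g2": 100.0, "g3": 100.0}
after = {"g1": 105.0, "g2": 40.0, "g3": 250.0}
hits = changed_genes(before, after)
print(hits)            # ['g2', 'g3']
print(len(hits) >= 1)  # True: under this cutoff the agent alters the profile
```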

In a preferred embodiment, as outlined above, screens may be done on 
individual genes and gene products (proteins). That is, having identified a particular 
differentially expressed gene as important in a particular state, screening of modulators of 
either the expression of the gene or the gene product itself can be done. The gene products of 
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins" 
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a 
fragment, or alternatively, be the full length protein to the fragment encoded by the nucleic 
acids of Tables 1-16. Preferably, the prostate cancer modulatory protein is a fragment. In a 
preferred embodiment, the prostate cancer amino acid sequence which is used to determine 
sequence identity or similarity is encoded by a nucleic acid of Tables 1-16. In another 
embodiment, the sequences are naturally occurring allelic variants of a protein encoded by a 
nucleic acid of Tables 1-16. In another embodiment, the sequences are sequence variants as 
further described herein. 

Preferably, the prostate cancer modulatory protein is a fragment 
approximately 14 to 24 amino acids long. More preferably the fragment is a soluble 
fragment. Preferably, the fragment includes a non-transmembrane region. In a preferred 
embodiment, the fragment has an N-terminal Cys to aid in solubility. In one embodiment, the 
C-terminus of the fragment is kept as a free acid and the N-terminus is a free amine to aid in 
coupling, i.e., to cysteine. 

In one embodiment the prostate cancer proteins are conjugated to an 
immunogenic agent as discussed herein. In one embodiment the prostate cancer protein is 
conjugated to BSA. 

Measurements of prostate cancer polypeptide activity, or of prostate cancer or 
the prostate cancer phenotype can be performed using a variety of assays. For example, the 
effects of the test compounds upon the function of the prostate cancer polypeptides can be 
measured by examining parameters described above. A suitable physiological change that 
affects activity can be used to assess the influence of a test compound on the polypeptides of 
this invention. When the functional consequences are determined using intact cells or 
animals, one can also measure a variety of effects such as, in the case of prostate cancer 
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone 
release, transcriptional changes to both known and uncharacterized genetic markers (e.g., 
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes 
in intracellular second messengers such as cGMP. In the assays of the invention, mammalian 
prostate cancer polypeptide is typically used, e.g., mouse, preferably human. 

Assays to identify compounds with modulating activity can be performed in
vitro. For example, a prostate cancer polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the prostate cancer polypeptide levels are determined in vitro by measuring the level of
protein or mRNA. The level of protein is measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the prostate cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly
labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the prostate cancer
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.
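
As an illustration of how a reporter readout might be summarized, the following Python sketch computes the fold change in mean reporter activity between modulator-treated and untreated cells. The function name and the example readings are hypothetical and are not taken from this specification; any real assay would also need normalization and controls appropriate to the reporter used.

```python
from statistics import mean


def reporter_fold_change(treated_readings, untreated_readings):
    """Ratio of mean reporter activity in treated cells to untreated cells.

    Hypothetical helper for illustration only. A value above 1 suggests
    increased reporter activity; a value below 1 suggests decreased activity.
    """
    baseline = mean(untreated_readings)
    if baseline == 0:
        raise ValueError("untreated reporter activity must be non-zero")
    return mean(treated_readings) / baseline


# Example with made-up luminescence readings (arbitrary units).
print(reporter_fold_change([200, 200], [100, 100]))
```

A fold change near 1 would suggest that the candidate has little effect on the promoter under the conditions tested.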

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins."
The prostate cancer protein may be a fragment, or alternatively, be the full length protein to a
fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure-activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate
cancer protein and a candidate compound, and determining the binding of the compound to
the prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g., for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate
cancer protein or the candidate agent is non-diffusably bound to an insoluble support having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is readily
separated from soluble material, and that is otherwise compatible with the overall method of
screening. The surface of such supports may be solid or porous and of any convenient shape.
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, Teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is not crucial so long as it
is compatible with the reagents and overall methods of the invention, maintains the activity of
the composition and is nondiffusable. Preferred methods of binding include the use of
antibodies (which do not sterically block either the ligand binding site or activation sequence
when the protein is bound to the support), direct binding to "sticky" or ionic supports,
chemical crosslinking, the synthesis of the protein or agent on the surface, etc. Following
binding of the protein or agent, excess unbound material is removed by washing. The sample
receiving areas may then be blocked through incubation with bovine serum albumin (BSA),
casein or other innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
prostate cancer protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the prostate cancer protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., 125I for the proteins and a fluorophore
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., a prostate cancer protein), such as an antibody, peptide, binding partner,
ligand, etc. Under certain circumstances, there may be competitive binding between the
compound and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the prostate cancer protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the prostate cancer protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the prostate cancer proteins. In
this embodiment, the methods comprise combining a prostate cancer protein and a competitor
in a first sample. A second sample comprises a test compound, a prostate cancer protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples, indicates the presence of an agent capable of
binding to the prostate cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the prostate cancer protein.
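
The two-sample comparison described above can be sketched numerically. The Python example below is a minimal illustration only: it assumes replicate competitor-binding signals for the first sample (protein plus competitor) and the second sample (protein, competitor and test compound), and flags a candidate when the mean signal changes by more than an arbitrary fraction. The function names, the example values and the 20% threshold are assumptions for illustration and do not appear in this specification.

```python
from statistics import mean


def competitor_binding_change(first_sample, second_sample):
    """Fractional change in mean competitor binding, (second - first) / first.

    first_sample:  replicate signals without the test compound.
    second_sample: replicate signals with the test compound present.
    """
    baseline = mean(first_sample)
    if baseline == 0:
        raise ValueError("baseline competitor binding must be non-zero")
    return (mean(second_sample) - baseline) / baseline


def is_candidate_binder(first_sample, second_sample, min_fractional_change=0.2):
    """True when competitor binding differs by at least the chosen fraction.

    The 0.2 default is an arbitrary illustrative threshold, not a value
    taken from the specification.
    """
    change = competitor_binding_change(first_sample, second_sample)
    return abs(change) >= min_fractional_change


# Made-up triplicate signals (arbitrary units).
print(competitor_binding_change([100, 100, 100], [50, 50, 50]))
print(is_candidate_binder([100, 100, 100], [50, 50, 50]))
```

In practice the threshold would be chosen from the variability of the controls rather than fixed in advance.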

Alternatively, differential screening is used to identify drug candidates that
bind to the native prostate cancer protein, but cannot bind to modified prostate cancer
proteins. The structure of the prostate cancer protein may be modeled, and used in rational
drug design to synthesize agents that interact with that site. Drug candidates that affect the
activity of a prostate cancer protein are also identified by screening drugs for the ability to
either enhance or reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are run in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled, agent is determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g., albumin, detergents, etc. which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a prostate cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising prostate cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous
or subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (i.e., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate prostate cancer agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is
provided. The method comprises administration of a prostate cancer inhibitor. In another
embodiment, a method of inhibiting prostate cancer is provided. The method comprises
administration of a prostate cancer inhibitor. In a further embodiment, methods of treating
cells or individuals with prostate cancer are provided. The method comprises administration
of a prostate cancer inhibitor.

In one embodiment, a prostate cancer inhibitor is an antibody as discussed
above. In another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate a normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in
semi-solid media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney, Culture of Animal Cells: a Manual of Basic Technique (3rd ed., 1994),
herein incorporated by reference. See also, the methods section of Garkavtsev et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until
they touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (3H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (3H)-thymidine at saturation density is a
preferred method of measuring density limitation of growth. Transformed host cells are
transfected with a prostate cancer-associated sequence and are grown for 24 hours at
saturation density in non-limiting medium conditions. The percentage of cells labeling with
(3H)-thymidine is determined autoradiographically. See, Freshney (1994), supra.
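
The labeling index referred to above is simply the percentage of counted cells that are labeled. A minimal Python sketch follows; the function name and the example counts are hypothetical and are given only to make the arithmetic concrete.

```python
def labeling_index(labeled_cells, total_cells):
    """Percentage of counted cells that incorporated the label.

    Hypothetical helper for illustration: labeled_cells and total_cells
    are counts scored from an autoradiograph.
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must be between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Example with made-up counts: 50 labeled cells out of 200 scored.
print(labeling_index(50, 200))
```

A lower index at saturation density in transfected cells than in control transformed cells would be consistent with restored density limitation of growth.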

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal
counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37:167-175 (1966); Eagle et al., J. Exp.
Med. 131:836-879 (1970); Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, Angiogenesis, tumor vascularization, and potential interference with tumor growth,
in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985)). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts (see, e.g., Folkman, Angiogenesis and Cancer, Sem. Cancer Biol. (1992)).

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless et al., J. Biol. Chem. 249:4295-4305 (1974);
Strickland & Beers, J. Biol. Chem. 251:5694-5702 (1976); Whur et al., Br. J. Cancer 42:305-312
(1980); Gullino, Angiogenesis, tumor vascularization, and potential interference with
tumor growth, in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985);
Freshney, Anticancer Res. 5:111-130 (1985).



Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the
level of invasion of host cells can be measured by using filters coated with Matrigel or some
other extracellular matrix constituent. Penetration into the gel, or through to the distal side of
the filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with 125I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.


Tumor growth in vivo 

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted.
Knock-out transgenic mice can be made by insertion of a marker gene or other heterologous gene
into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi et al., Science 244:1288 (1989)). Chimeric targeted mice can be
derived according to Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual,
Cold Spring Harbor Laboratory (1988) and Teratocarcinomas and Embryonic Stem Cells: A
Practical Approach, Robertson, ed., IRL Press, Washington, D.C., (1987).

Alternatively, various immune-suppressed or immune-deficient host animals
can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J.
Natl. Cancer Inst. 52:921 (1974)), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley et al., Br. J. Cancer 38:263 (1978); Selby et al., Br. J. Cancer
41:52 (1980)) can be used as a host. Transplantable tumor cells (typically about 10^6 cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that have statistically
significant reduction (using, e.g., Student's t test) are said to have inhibited growth.
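
The comparison of tumor growth against control can be sketched as follows. This Python example is illustrative only and rests on two stated assumptions that are not part of this specification: tumor volume is approximated from the two largest dimensions with the commonly used caliper formula V = L x W^2 / 2, and the two groups are compared with a pooled-variance (equal variance) two-sample Student's t statistic. The function names and measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Approximate tumor volume (mm^3) from its two largest dimensions.

    Assumption: ellipsoid caliper approximation V = L * W**2 / 2, where W
    is the smaller of the two dimensions. The source names no formula.
    """
    longest = max(length_mm, width_mm)
    shortest = min(length_mm, width_mm)
    return longest * shortest ** 2 / 2


def student_t(treated, control):
    """Return (t statistic, degrees of freedom) for a pooled-variance t test."""
    n1, n2 = len(treated), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two measurements")
    pooled_var = ((n1 - 1) * variance(treated)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    if standard_error == 0:
        raise ValueError("groups have no variance; t is undefined")
    return (mean(treated) - mean(control)) / standard_error, n1 + n2 - 2


# Made-up caliper measurements (mm) for treated and control animals.
treated = [tumor_volume(l, w) for l, w in [(6, 4), (7, 4), (6, 5)]]
control = [tumor_volume(l, w) for l, w in [(10, 8), (11, 8), (12, 9)]]
t_stat, dof = student_t(treated, control)
print(t_stat, dof)
```

The magnitude of the t statistic would then be compared with the critical value for the chosen significance level and the reported degrees of freedom, for example from a standard t table.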

Methods of identifying variant prostate cancer-associated sequences

Without being bound by theory, expression of various prostate cancer
sequences is correlated with prostate cancer. Accordingly, disorders based on mutant or
variant prostate cancer genes may be determined. In one embodiment, the invention provides
methods for identifying cells containing variant prostate cancer genes, e.g., determining all or
part of the sequence of at least one endogenous prostate cancer gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the prostate cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one prostate cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced prostate cancer gene to a known prostate cancer
gene, i.e., a wild-type gene.

The sequence of all or part of the prostate cancer gene can then be compared
to the sequence of a known prostate cancer gene to determine if any differences exist. This
can be done using any number of known homology programs, such as Bestfit, etc. In a
preferred embodiment, the presence of a difference in the sequence between the prostate
cancer gene of the patient and the known prostate cancer gene correlates with a disease state
or a propensity for a disease state, as outlined herein.
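
To make the idea of a sequence comparison concrete, the following Python sketch lists the positions at which a sample sequence differs from a reference sequence. It is a deliberately naive illustration: it assumes the two sequences have already been aligned to equal length, and it is not a substitute for a homology program such as Bestfit. The function name and the example sequences are hypothetical.

```python
def sequence_differences(sample, reference):
    """List (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length.
    Positions are 1-based. Comparison is case-insensitive.
    """
    if len(sample) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    differences = []
    pairs = zip(sample.upper(), reference.upper())
    for position, (sample_base, reference_base) in enumerate(pairs, start=1):
        if sample_base != reference_base:
            differences.append((position, reference_base, sample_base))
    return differences


# Made-up five-base example with a single substitution at position 4.
print(sequence_differences("ATGCA", "ATGGA"))
```

An empty list would indicate no differences over the compared region; insertions and deletions require a true alignment step, which this sketch does not attempt.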

In a preferred embodiment, the prostate cancer genes are used as probes to
determine the number of copies of the prostate cancer gene in the genome.

In another preferred embodiment, the prostate cancer genes are used as probes
to determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations, and the like are identified in the prostate
cancer gene locus.

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of a prostate cancer
protein or modulator thereof is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces effects for which it is administered. The exact dose will
depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992), Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981; Lloyd, The Art, Science and
Technology of Pharmaceutical Compounding (1999); and Pickar, Dosage Calculations
(1999)). As is known in the art, adjustments for prostate cancer degradation, systemic versus
localized delivery, and rate of new protease synthesis, as well as the age, body weight,
general health, sex, diet, time of administration, drug interaction and the severity of the
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans 
and other animals, particularly mammals. Thus the methods are applicable to both human 
therapy and veterinary applications. In the preferred embodiment the patient is a mammal,
preferably a primate, and in the most preferred embodiment the patient is human.

The administration of the prostate cancer proteins and modulators thereof of
the present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a prostate
cancer protein in a form suitable for administration to a patient. In the preferred embodiment,
the pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a prostate
cancer protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an
aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like.
These solutions are sterile and generally free of undesirable matter. These compositions may
be sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride,
sodium lactate and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., Remington's Pharmaceutical Science (15th ed., 1980) and Goodman & Gilman,
The Pharmacological Basis of Therapeutics (Hardman et al., eds., 1996)).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be administered
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

WO 02/30268 PCT/US01/32045

It will be appreciated that the present prostate cancer protein-modulating
compounds can be administered alone or in combination with additional prostate cancer
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-16, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of prostate cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger & Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel et al., eds., Current Protocols (supplemented through 1999),
and Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 1-3, 1989)).

In a preferred embodiment, prostate cancer proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
prostate cancer genes (including the full-length sequence, partial sequences, or
regulatory sequences of the prostate cancer coding regions) can be administered in a gene
therapy application. These prostate cancer genes can include antisense applications, either as
gene therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Prostate cancer polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, e.g., lipidated peptides (see, e.g., Vitiello, A. et al., J. Clin. Invest.
95:341 (1995)), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge et al., Molec. Immunol. 28:287-294 (1991); Alonso et al.,
Vaccine 12:299-306 (1994); Jones et al., Vaccine 13:675-681 (1995)), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature
344:873-875 (1990); Hu et al., Clin. Exp. Immunol. 113:235-243 (1998)), multiple antigen
peptide systems (MAPs) (see, e.g., Tam, Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413 (1988);
Tam, J. Immunol. Methods 196:17-32 (1996)), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus et al., In: Concepts in vaccine development (Kaufmann, ed., p. 379, 1996);
Chakrabarti et al., Nature 320:535 (1986); Hu et al., Nature 320:537 (1986); Kieny et al.,
AIDS Bio/Technology 4:790 (1986); Top et al., J. Infect. Dis. 124:148 (1971); Chanda et al.,
Virology 175:535 (1990)), particles of viral or synthetic origin (see, e.g., Kofler et al., J.
Immunol. Methods 192:25 (1996); Eldridge et al., Sem. Hematol. 30:16 (1993); Falo et al.,
Nature Med. 7:649 (1995)), adjuvants (Warren et al., Annu. Rev. Immunol. 4:369 (1986);
Gupta et al., Vaccine 11:293 (1993)), liposomes (Reddy et al., J. Immunol. 148:1585 (1992);
Rock, Immunol. Today 17:131 (1996)), or naked or particle absorbed cDNA (Ulmer et al.,
Science 259:1745 (1993); Robinson et al., Vaccine 11:957 (1993); Shiver et al., In: Concepts
in vaccine development (Kaufmann, ed., p. 423, 1996); Cease & Berzofsky, Annu. Rev.
Immunol. 12:923 (1994) and Eldridge et al., Sem. Hematol. 30:16 (1993)). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors,
retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like,
will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al.,
Mol. Med. Today 6:66-71 (2000); Shedlock et al., J. Leukoc. Biol. 68:793-806 (2000); Hipp et
al., In Vivo 14:571-85 (2000)).

Methods for the use of genes as DNA vaccines are well known, and include
placing a prostate cancer gene or portion of a prostate cancer gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in a prostate cancer patient.
The prostate cancer gene used for DNA vaccines can encode full-length prostate cancer
proteins, but more preferably encodes portions of the prostate cancer proteins including
peptides derived from the prostate cancer protein. In one embodiment, a patient is
immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from
a prostate cancer gene. For example, prostate cancer-associated genes or sequences encoding
subfragments of a prostate cancer protein are introduced into expression vectors and tested
for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T
cell responses. This procedure provides for production of cytotoxic T cell responses against
cells which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the prostate cancer polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, prostate cancer genes find use in generating
animal models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., antisense RNA directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology,
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary.

It is also possible that the prostate cancer protein is overexpressed in prostate
cancer. As such, transgenic animals can be generated that overexpress the prostate cancer
protein. Depending on the desired expression level, promoters of various strengths can be
employed to express the transgene. Also, the number of copies of the integrated transgene
can be determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods find use as animal models of prostate cancer and are
additionally useful in screening for modulators to treat prostate cancer.

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, prostate cancer-specific
nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials, they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of
prostate cancer-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: a prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and
instructions for testing prostate cancer-associated activity. Optionally, the kit contains
biologically active prostate cancer protein. A wide variety of kits and components can be
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality
of genes or products. The genes will be selected based on correlations with important
parameters in disease which may be identified in historical or outcome data.

EXAMPLES

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints 

Purifying total RNA from tissue sample using TRIzol Reagent

The sample weight is first estimated. The tissue samples are homogenized in
1 ml of TRIzol per 50 mg of tissue using a homogenizer (e.g., Polytron 3100). The size of
the generator/probe used depends upon the sample amount. A generator that is too large for
the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. A
larger generator (e.g., 20 mm) is suitable for tissue samples weighing more than 0.6 g.
Tubes should not be overfilled. If the working volume is greater than 2 ml and no greater than
10 ml, a 15 ml polypropylene tube (Falcon 2059) is suitable for homogenization.

Tissues should be kept frozen until homogenized. The TRIzol is added
directly to the frozen tissue before homogenization. Following homogenization, the insoluble
material is removed from the homogenate by centrifugation at 7500 x g for 15 min. in a
Sorvall superspeed or 12,000 x g for 10 min. in an Eppendorf centrifuge at 4°C. The cleared
homogenate is then transferred to a new tube(s). Samples may be frozen and stored at -60 to
-70°C for at least one month; otherwise, the purification is continued.

The next process is phase separation. The homogenized samples are incubated
for 5 minutes at room temperature. Then, 0.2 ml of chloroform per 1 ml of TRIzol reagent is
added to the homogenization mixture. The tubes are securely capped and shaken vigorously
by hand (do not vortex) for 15 seconds. The samples are then incubated at room temperature for
2-3 minutes and next centrifuged at 6500 rpm in a Sorvall superspeed for 30 min. at 4°C.

The next process is RNA precipitation. The aqueous phase is transferred to a
fresh tube. The organic phase can be saved if isolation of DNA or protein is desired. Then
0.5 ml of isopropyl alcohol is added per 1 ml of TRIzol reagent used in the original
homogenization. Then, the tubes are securely capped and inverted to mix. The samples are
then incubated at room temperature for 10 minutes and centrifuged at 6500 rpm in a Sorvall for 20
min. at 4°C.

The RNA is then washed. The supernatant is poured off and the pellet washed
with cold 75% ethanol. 1 ml of 75% ethanol is used per 1 ml of the TRIzol reagent used in
the initial homogenization. The tubes are capped securely and inverted several times to
loosen the pellet without vortexing. They are next centrifuged at <8000 rpm (<7500 x g) for
5 minutes at 4°C.
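
The reagent ratios stated above (1 ml TRIzol per 50 mg tissue, then 0.2 ml chloroform, 0.5 ml isopropyl alcohol and 1 ml 75% ethanol per 1 ml TRIzol) can be scaled to a given sample weight. The following is a minimal sketch only; the function name `trizol_volumes` and its dictionary keys are hypothetical and are not part of the protocol.

```python
def trizol_volumes(tissue_mg: float) -> dict:
    """Scale reagent volumes (in ml) to tissue mass using the ratios in the text."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol_ml = tissue_mg / 50.0  # 1 ml TRIzol per 50 mg tissue
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ml": 0.2 * trizol_ml,   # phase separation
        "isopropanol_ml": 0.5 * trizol_ml,  # RNA precipitation
        "ethanol_75_ml": 1.0 * trizol_ml,   # RNA wash
    }


if __name__ == "__main__":
    # Example: a 100 mg tissue sample.
    for reagent, volume in trizol_volumes(100).items():
        print(f"{reagent}: {volume:.2f} ml")
```

A helper of this kind may also make it easier to check that the working volume fits the tube size chosen for homogenization.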

The RNA wash is decanted. The pellet is carefully transferred to an
Eppendorf tube (sliding it down the tube into the new tube, using a pipet tip to help guide it
in if necessary). Tube sizes for precipitating the RNA depend on the working volumes.
Larger tubes may take too long to dry. The pellet is dried. The RNA is then resuspended in an
appropriate volume of DEPC H2O (e.g., to 2-5 ug/ul). The absorbance is then measured.
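
The absorbance reading can be converted to an RNA yield. The sketch below assumes the commonly used conversion of about 40 ug/ml RNA per A260 unit at a 1 cm path length; that factor, the function name `rna_yield_ug` and its parameters are assumptions for illustration and are not stated in the protocol above.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # assumed standard conversion factor for RNA


def rna_yield_ug(a260: float, dilution_factor: float, volume_ul: float) -> float:
    """Estimate total RNA (ug) from an A260 reading of a diluted aliquot."""
    conc_ug_per_ml = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    return conc_ug_per_ml * (volume_ul / 1000.0)  # convert ul to ml


if __name__ == "__main__":
    # Example: A260 of 0.5 on a 1:50 dilution of a 100 ul sample.
    print(f"{rna_yield_ug(0.5, 50, 100):.1f} ug total RNA")
```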

The poly A+ mRNA may next be purified from total RNA by other methods
such as Qiagen's RNeasy kit. The poly A+ mRNA is purified from total RNA by adding the
Oligotex suspension, which has been heated to 37°C and mixed prior to adding to the RNA.
The Elution Buffer is incubated at 70°C. If there is precipitate in the buffer, warm up the 2x
Binding Buffer at 65°C. Then the total RNA is mixed with DEPC-treated water, 2x Binding
Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook and next
incubated for 3 minutes at 65°C and 10 minutes at room temperature.

The preparation is centrifuged for 2 minutes at 14,000 to 18,000 g, preferably
at a "soft setting." The supernatant is removed without disturbing the Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. The supernatant is saved until
satisfactory binding and elution of poly A+ mRNA has been found.

Then, the preparation is gently resuspended in Wash Buffer OW2 and pipetted
onto the spin column and centrifuged at full speed (soft setting if possible) for 1 minute.

Next, the spin column is transferred to a new collection tube and gently
resuspended in Wash Buffer OW2 and centrifuged as described herein.

Then, the spin column is transferred to a new tube and eluted with 20 to 100 ul
of preheated (70°C) Elution Buffer. The Oligotex resin is gently resuspended by pipetting up
and down. The centrifugation is repeated as above and the elution repeated with fresh elution
buffer or first eluate to keep the elution volume low.

The absorbance is next read to determine the yield, using diluted Elution
Buffer as the blank.

Before proceeding with cDNA synthesis, the mRNA is precipitated, as components
left over from the Oligotex purification procedure or present in the Elution Buffer
will inhibit downstream enzymatic reactions of the mRNA.
0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol is added and the preparation
precipitated at -20°C for 1 hour to overnight (or 20-30 min. at -70°C), and centrifuged at
14,000-16,000 x g for 30 minutes at 4°C. Next, the pellet is washed with 0.5 ml of 80%
ethanol (-20°C) and then centrifuged at 14,000-16,000 x g for 5 minutes at room temperature.
The 80% ethanol wash is then repeated. The last bit of ethanol is then dried from the pellet
without use of a speed vacuum, and the pellet is then resuspended in DEPC H2O at 1 ug/ul
concentration.

Alternatively, the RNA may be purified using other methods (e.g., Qiagen's RNeasy kit).

No more than 100 ug is added to the RNeasy column. The sample volume is
adjusted to 100 ul with RNase-free water. 350 ul Buffer RLT and then 250 ul ethanol
(100%) are added to the sample. The preparation is then mixed by pipetting and applied to an
RNeasy mini spin column for centrifugation (15 sec at >10,000 rpm). If yield is low, reapply
the flowthrough to the column and centrifuge again.

Then, transfer the column to a new 2 ml collection tube, add 500 ul Buffer
RPE and centrifuge for 15 sec at >10,000 rpm. The flowthrough is discarded. 500 ul Buffer
RPE is then added and the preparation is centrifuged for 15 sec at >10,000 rpm. The
flowthrough is discarded, and the column membrane dried by centrifuging for 2 min at
maximum speed. The column is transferred to a new 1.5-ml collection tube. 30-50 ul of
RNase-free water is applied directly onto the column membrane. The column is then centrifuged
for 1 min at >10,000 rpm and the elution step repeated.

The absorbance is then read to determine yield. If necessary, the material may 
be ethanol precipitated with ammonium acetate and 2.5X volume 100% ethanol. 

First Strand cDNA Synthesis

The first strand can be made using Gibco's "Superscript Choice System
for cDNA Synthesis" kit. The starting material is 5 ug of total RNA or 1 ug of polyA+
mRNA. For total RNA, 2 ul of Superscript RT is used; for polyA+ mRNA, 1 ul of
Superscript RT is used. The final volume of first strand synthesis mix is 20 ul. The RNA
should be in a volume no greater than 10 ul. The RNA is incubated with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C followed by addition on ice of 7 ul of a mix of: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. The preparation is then incubated at
37°C for 2 min before addition of the Superscript RT followed by incubation at 37°C for 1
hour.

Second Strand Synthesis 

For the second strand synthesis, place 1st strand reactions on ice and add: 91
ul DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2
hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA.
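
As a cross-check on the volumes listed above, the additions can be summed with the 20 ul first strand reaction to confirm that the 5X 2nd Strand Buffer ends up at 1X. This is a minimal sketch; the variable and function names are hypothetical, and the dNTP figure counts only the dNTP added at this step (carry-over from the first strand reaction is ignored).

```python
FIRST_STRAND_UL = 20.0  # final volume of the first strand reaction

# Components added for second strand synthesis (volumes in ul, from the text).
SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA Ligase (10 U/ul)": 1.0,
    "E. coli DNA Polymerase (10 U/ul)": 4.0,
    "RNase H (2 U/ul)": 1.0,
}


def second_strand_summary():
    """Return (total volume in ul, final buffer strength in X, added dNTP in mM)."""
    total_ul = FIRST_STRAND_UL + sum(SECOND_STRAND_ADDITIONS_UL.values())
    buffer_x = 5.0 * SECOND_STRAND_ADDITIONS_UL["5X 2nd Strand Buffer"] / total_ul
    dntp_mm = 10.0 * SECOND_STRAND_ADDITIONS_UL["10 mM dNTP mix"] / total_ul
    return total_ul, buffer_x, dntp_mm


if __name__ == "__main__":
    total, buffer_x, dntp = second_strand_summary()
    print(f"total {total} ul, buffer {buffer_x}X, added dNTP {dntp} mM")
```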

Cleaning up cDNA 

The cDNA is purified using Phenol:Chloroform:Isoamyl Alcohol (25:24:1)
and Phase-Lock gel tubes. The PLG tubes are centrifuged for 30 sec at maximum speed.
The cDNA mix is then transferred to a PLG tube. An equal volume of
phenol:chloroform:isoamyl alcohol is then added, the preparation shaken vigorously (no
vortexing), and centrifuged for 5 minutes at maximum speed. The top aqueous solution is
transferred to a new tube and ethanol precipitated by adding 7.5X 5M NH4OAc and 2.5X
volume of 100% ethanol. Next, it is centrifuged immediately at room temperature for 20
min at maximum speed. The supernatant is removed, and the pellet washed 2X with cold
80% ethanol. As much ethanol wash as possible should be removed before air drying the
pellet and resuspending it in 3 ul RNase-free water.






In vitro Transcription (IVT) and labeling with biotin

In vitro Transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul
T7 10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75
mM) (Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7
transcription buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is
20 ul. Incubate 6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
Clean-up follows the previous instructions for RNeasy columns or Qiagen's RNeasy protocol
handbook. The cRNA often needs to be ethanol precipitated and resuspended in a volume
compatible with the fragmentation step.
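
The stated 20 ul final volume can be checked against the listed components, and the final nucleotide concentrations follow from the stock concentrations. The sketch below is illustrative only; the data structure and the function name `ivt_final_concentrations` are hypothetical.

```python
# (volume in ul, stock concentration in mM or None) for each IVT component.
IVT_COMPONENTS = {
    "cDNA": (1.5, None),
    "T7 10x ATP": (2.0, 75.0),
    "T7 10x GTP": (2.0, 75.0),
    "T7 10x CTP": (1.5, 75.0),
    "T7 10x UTP": (1.5, 75.0),
    "Bio-11-UTP": (3.75, 10.0),
    "Bio-16-CTP": (3.75, 10.0),
    "10x T7 transcription buffer": (2.0, None),
    "10x T7 enzyme mix": (2.0, None),
}


def ivt_final_concentrations(components=IVT_COMPONENTS):
    """Return (total volume in ul, {name: final mM}) for components with a stock mM."""
    total_ul = sum(volume for volume, _ in components.values())
    final_mm = {
        name: volume * stock / total_ul
        for name, (volume, stock) in components.items()
        if stock is not None
    }
    return total_ul, final_mm


if __name__ == "__main__":
    total, conc = ivt_final_concentrations()
    print(f"total volume: {total} ul")
    for name, value in conc.items():
        print(f"{name}: {value:.3f} mM")
```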

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1x Fragmentation buffer (5x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; and 1x MES hyb buffer to 300 ul.

The hybridization reaction is conducted with non-biotinylated IVT (purified
by RNeasy columns) (see Example 1 for steps from tissue to IVT). The following mixture is
prepared:

    IVT antisense RNA; 4 ug:          ul
    Random Hexamers (1 ug/ul):      4 ul
    H2O:                              ul
                                   14 ul

Incubate the above 14 ul mixture at 70°C for 10 min.; then put on ice.

The reverse transcription procedure uses the following mixture:

    0.1 M DTT:                  3 ul
    50X dNTP mix:               0.6 ul
    H2O:                        2.4 ul
    Cy3 or Cy5 dUTP (1 mM):     3 ul
    SS RT II (BRL):             1 ul

The above solution is added to the hybridization reaction and incubated for 30 min. at 42°C.
Then, 1 ul SSII is added and incubated for another hour before being placed on ice.

The 50X dNTP mix contains 25 mM of cold dATP, dCTP, and dGTP and 10 mM
of dTTP, and is made by adding 25 ul each of 100 mM dATP, dCTP, and dGTP and 10 ul of
100 mM dTTP to 15 ul H2O.
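
The recipe for the 50X dNTP mix is consistent with the stated final concentrations, as the short calculation below illustrates. It is a sketch only; the function name `mix_concentrations` and its argument layout are hypothetical.

```python
def mix_concentrations(components: dict, water_ul: float) -> dict:
    """Final concentration (mM) of each component after pooling with water.

    components maps a name to (volume in ul, stock concentration in mM).
    """
    total_ul = water_ul + sum(volume for volume, _ in components.values())
    return {name: volume * stock / total_ul
            for name, (volume, stock) in components.items()}


# Volumes and stock concentrations as given in the text.
DNTP_50X = {
    "dATP": (25.0, 100.0),
    "dCTP": (25.0, 100.0),
    "dGTP": (25.0, 100.0),
    "dTTP": (10.0, 100.0),
}

if __name__ == "__main__":
    for name, conc in mix_concentrations(DNTP_50X, water_ul=15.0).items():
        print(f"{name}: {conc:g} mM")
```

The total volume works out to 100 ul, so each 25 ul addition of a 100 mM stock gives 25 mM and the 10 ul addition of dTTP gives 10 mM.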

RNA degradation is performed as follows. Add 86 ul H2O, 1.5 ul 1M NaOH/
2 mM EDTA and incubate at 65°C for 10 min. For U-Con 30, 500 ul TE/sample, spin at 7000 g
for 10 min, save flow through for purification. For Qiagen purification, suspend u-con
recovered material in 500 ul buffer PB and proceed using Qiagen protocol. For DNAse
digestion, add 1 ul of 1/100 dilution of DNAse/30 ul Rx and incubate at 37°C for 15 min.
Incubate for 5 min at 95°C to denature the DNAse.

Sample preparation 

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to
21.8 ul final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat at
95°C for 2 min and slow cool at room temperature for 20 min. Put on slide and hybridize overnight at
64°C. Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 ml 20X
SSC + 0.75 ml 10% SDS in 250 ml H2O; 1X SSC: 5 min., 12.5 ml 20X SSC in 250 ml H2O;
0.2X SSC: 5 min., 2.5 ml 20X SSC in 250 ml H2O. Dry slides and scan at appropriate PMTs
and channels.
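
The wash recipes above can be checked by simple dilution arithmetic. The sketch below assumes that "in 250 ml H2O" means a final volume of 250 ml; that reading, and the function name `diluted_strength`, are assumptions for illustration.

```python
def diluted_strength(stock_strength: float, stock_ml: float, final_ml: float) -> float:
    """Strength after diluting stock_ml of a stock to a final volume of final_ml."""
    return stock_strength * stock_ml / final_ml


FINAL_ML = 250.0  # assumed final volume of each wash solution

if __name__ == "__main__":
    # SSC strength (X) for the three washes, from the 20X SSC volumes in the text.
    for label, ssc_ml in (("wash 1", 37.5), ("wash 2", 12.5), ("wash 3", 2.5)):
        print(f"{label}: {diluted_strength(20.0, ssc_ml, FINAL_ML):g}X SSC")
    # SDS percentage in the first wash, from 0.75 ml of 10% SDS.
    print(f"wash 1: {diluted_strength(10.0, 0.75, FINAL_ML):g}% SDS")
```

Under that assumption the three washes come out at 3X, 1X and 0.2X SSC, with 0.03% SDS in the first wash, matching the labels in the text.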

Example 2: Taxol resistant Xenograft Model of Human Prostate Cancer 

Treatment regimens that include paclitaxel (Taxol; Bristol-Myers Squibb
Company, Princeton, NJ) have been particularly successful in treating hormone-refractory
prostate cancer in the phase II setting (Smith et al., Semin. Oncol. 26(1 Suppl 2):109-11
(1999)). However, many patients develop tumors which are initially, or later become,
resistant to taxol. To identify genes that may be involved with resistance to taxol, or are
regulated in response to taxol resistance, and therefore may be used to treat, or identify, taxol
resistance in patients, the following experiments were carried out.

The androgen-independent human cell line CWR22R was grown as a
xenograft in nude mice (Nagabhushan et al., Cancer Res. 56(13):3042-3046 (1996); Agus et
al., J. Natl. Cancer Inst. 91(21):1869-1876 (1999); Bubendorf et al., J. Natl. Cancer Inst.
91(20):1758-1764 (1999)). Initially, these xenograft tumors were sensitive to therapeutic
doses of taxol. The mice were treated continuously with sub-therapeutic doses, and the
tumors were allowed to grow for 3-4 weeks, before surgical removal of the tumors. The
tumor from an individual mouse was then minced, and a small portion was then injected into
a healthy nude mouse, establishing a second passage of the tumor. This mouse was then
treated continuously with the same sub-therapeutic dose of taxol. This process was repeated
14 times, and a portion of each generation of xenograft tumor was collected. There was
increasing resistance to therapeutic doses of taxol with each generation. By the end of the
process, the tumors were fully resistant to therapeutic doses of taxol. RNA from each
generation of tumor was then isolated, and individual mRNA species were quantified using a
custom Affymetrix GeneChip® oligonucleotide microarray, with probes to interrogate
approximately 35,000 unique mRNA transcripts. Genes were selected that showed a
statistically significant up-regulation, or down-regulation, during the subsequent generations
of increasingly taxol-resistant tumors. Only one gene was significantly up-regulated, whereas
24 genes were down-regulated; these are presented in Table 10.
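
The text does not state which statistical test was used to select genes. Purely as an illustration of how a monotonic trend across successive tumor generations might be flagged, the sketch below computes a Spearman rank correlation between generation number and expression level and applies a cutoff. The choice of Spearman correlation, the 0.9 threshold and all function names are assumptions, not the method used in the experiments.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def classify_trend(expression_by_generation, threshold=0.9):
    """Label a gene 'up', 'down' or 'none' by rank correlation with generation."""
    generations = list(range(1, len(expression_by_generation) + 1))
    rho = pearson(ranks(generations), ranks(expression_by_generation))
    if rho >= threshold:
        return "up"
    if rho <= -threshold:
        return "down"
    return "none"


if __name__ == "__main__":
    # Hypothetical expression values across five generations.
    print(classify_trend([120, 150, 180, 260, 400]))
    print(classify_trend([900, 700, 650, 300, 120]))
```

A real analysis would also need a significance estimate and a correction for testing roughly 35,000 transcripts, which this sketch does not attempt.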

The gene sequences identified to be overexpressed in prostate cancer may be
used to identify coding regions from the public DNA database. The sequences may be used
to either identify genes that encode known proteins, or they may be used to predict the coding
regions from genomic DNA using exon prediction algorithms, such as FGENESH (Salamov
and Solovyev, 2000, Genome Res. 10:516-522). In addition, one of ordinary skill in the art
would understand how to obtain the unigene cluster identification and sequence information
according to the exemplar accession numbers provided in Tables 1-16 (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

TABLE 1 shows genes, including expression sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu01
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal body tissue

Pkey UnigeneID ExAccn Unigene Title R1

[illegible] [illegible] [illegible] upstream regulatory element binding prot 6.5
[illegible] [illegible] [illegible] ESTs 6.4
[illegible] [illegible] [illegible] ESTs; Weakly similar to neuronal thread 6.4
[illegible] [illegible] [illegible] ESTs 6.4
100797 [illegible] [illegible] Human [illegible] mRNA 6.3
[illegible] [illegible] [illegible] homogentisate 1,2-dioxygenase (homogentisa 6.3
[illegible] [illegible] AA491714 Homo sapiens mRNA for KIAA0896 protein 6.3
[illegible] [illegible] [illegible] ESTs; Weakly similar to ANKYRIN [illegible] 6.3
[illegible] [illegible] [illegible] prostate differentiation factor 6.3
[illegible] [illegible] [illegible] ESTs; Weakly similar to similar to GTP-b [illegible]
[illegible] [illegible] [illegible] [illegible] 6.2
[illegible] [illegible] [illegible] ESTs 6.2
[illegible] [illegible] [illegible] [illegible] [illegible]
[illegible] [illegible] [illegible] ESTs; Weakly similar to (defline not ava [illegible]
[illegible] [illegible] [illegible] ESTs 6.1
[illegible] [illegible] [illegible] [illegible] 6.1
132116 [illegible] [illegible] ESTs [illegible]
130328 Hs.203213 [illegible] ESTs 5.9
[illegible] [illegible] [illegible] ESTs 5.8

105496 Hs.301997 AA256323 ESTs 5.7
116334 Hs.48948 AA491457 ESTs 5.7
107968 Hs.61539 AA034020 ESTs 5.7
120132 Hs.125019 Z38839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6
106375 Hs.289072 AA443993 ESTs 5.6
132550 Hs.170195 AA029597 bone morphogenetic protein 7 (osteogenic 5.6
124777 Hs.140237 R41933 ESTs; Weakly similar to neuronal thread 5.6
100311 Hs.337616 D50640 phosphodiesterase 3B; cGMP-inhibited 5.6
101791 Hs.62354 M83822 Human beige-like protein (BGL) mRNA; par 5.5
117698 Hs.45107 N41002 ESTs 5.5
132387 Hs.281434 R70914 heat shock 70kD protein 1 5.5
122041 Hs.98732 AA431407 Homo sapiens Chromosome 16 BAC clone CIT 5.5
133723 Hs.262476 AA088851 S-adenosylmethionine decarboxylase 1
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1 13938 W81598 ESTs 5.4 

133015 Hs.246315 AA047036 ESTs 5.4

125745 Hs.75722 AI283493 ribophorin II 5.4

107295 Hs.80120 T34527 UDP-N-acetyl-alpha-D-galactosamine:polyp 5.4

108186 Hs.7780 AA056482 ESTs 5.3

100184 Hs.21223 D17408 calponin 1; basic; smooth muscle 5.3

104466 Hs.326392 N25110 Human guanine nucleotide exchange factor 5.3

104033 Hs.98944 AA365031 ESTs 5.3

110844 Hs.167531 N31952 ESTs; Weakly similar to (defline not ava 5.3

129056 Hs.108336 H70627 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3

102805 Hs£5351 U90304 iroquois-class homeodomain protein 5.3

133493 Hs.194369 AA284143 Homo sapiens chromosome 1 atrophin-1 rel 5.3

129184 Hs.109201 W26769 ESTs; Highly similar to (defline not ava 5.2

134158 Hs.79428 U15174 BCL2/adenovirus E1B 19kD-interacting pro 5.2

107240 Hs.159872 D59368 ESTs 5.2

104787 AA027317 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.2

123527 Hs.108327 AA608679 damage-specific DNA binding protein 1 (1 5.2

116646 Hs.194228 F03048 ESTs; Moderately similar to !!!! ALU SUB 5.2

101448 Hs.195850 M21389 keratin 5 (epidermolysis bullosa simplex 5.1

116188 Hs.184598 AA464728 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1

126259 Hs.281428 Z21472 ESTs; Moderately similar to !!!! ALU SUB 5.1

105921 Hs.169119 AA402613 ESTs 5.1

103375 Hs.54416 X91868 sine oculis homeobox (Drosophila) homolo 5.1

128871 Hs.106778 AA400271 ESTs; Highly similar to (defline not ava 5.1

112681 Hs.148932 R87331 ESTs; Moderately similar to semaphorin V 5.1

105784 Hs.226434 AA350771 ESTs 5.1

116238 Hs.47144 AA479362 ESTs 5

102913 Hs.80342 X07696 keratin 15 5

103011 Hs.326035 X52541 early growth response 1 5

126023 H58881 yr36d09.r1 Soares fetal liver spleen 1NF 5

103709 Hs.13804 AA037316 ESTs 5

118981 Hs.39288 N93839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

134807 Hs.89732 X78932 zinc finger protein 273 5

100079 Hs.23311 AB002365 Human mRNA for KIAA0367 gene; partial cd 4.9

132047 Hs.3796 D83492 EphB6 4.9

132880 Hs.177537 AA444369 ESTs 4.9

124049 Hs.74519 F10523 primase; polypeptide 2A (58kD) 4.8

133330 Hs.71119 U42360 Human N33 mRNA; complete cds 4.8

104776 AA026349 ESTs 4.8

122593 Hs.128749 AA453310 Homo sapiens alpha-methylacyl-CoA racema 4.8

103912 Hs.143087 AA251078 ESTs 4.8

113961 Hs.26009 W86307 Homo sapiens mRNA for KIAA0860 protein; 4.8

105288 Hs.3585 AA233168 ESTs; Weakly similar to coded for by C. 4.8

135035 Hs.284186 H89575 ESTs 4.8

104144 Hs.183390 AA447439 ESTs; Weakly similar to ZINC FINGER PROT 4.8

129389 Hs.288126 AA621604 ESTs 4.8

125982 R98091 RAE1 (RNA export 1; S.pombe) homolog 4.8

125162 Hs^6243 W44682 ESTs 4.8

103023 Hs.117950 X53793 multifunctional polypeptide similar to S 4.7

129735 W80701 ESTs; Weakly similar to HERV-E envelope 4.7

104479 Hs.106390 N36040 ESTs 4.7

103731 AA070545 zm7c3.fi Stratagene neuroepithelium (#93 4.7

126575 Hs.127602 W72416 ESTs 4.7

124578 Hs.231500 N68321 Human glucose transporter-like protein-I 4.7

130617 Hs.1674 M90516 glutamine-fructose-6-phosphate transamin 4.7

116752 Hs.91622 H06373 Homo sapiens clone 24456 mRNA sequence 4.7

100279 Hs.82007 D42084 Human mRNA for KIAA0094 gene; partial cd 4.7

126288 Hs.89576 AI479264 ESTs 4.7

131836 Hs.32990 AA610086 ESTs 4.7

106717 Hs.239489 AA465093 TIA1 cytotoxic granule-associated RNA-bi 4.7

114542 Hs.91011 AA055768 ESTs 4.6

103806 AA130614 zo1*2j1 Stratagene neuroepithelium NT2R 4.6

130529 AA173238 small inducible cytokine A5 (RANTES) 4.6

115675 HSJ2065 AA406546 ESTs 4.6

111386 Hs.293798 N95326 ESTs 4.6

106503 Hs.29679 AA452411 ESTs 4.6

119943 Hs.14158 W86835 copine III 4.6

104459 Hs.100070 M91493 EST 4.6

100774 Hs.89603 HG371-HT1063 Mucin 1, Epithelial, Alt Splice 6 4.6

100652 Hs.142653 HG2825-HT2949 Ret Transforming Gene 4.6

132015 Hs.3731 D11900 ESTs 4.6

126086 H70975 yr73g01.r1 Soares fetal liver spleen 1NF 4.6

130888 Hs.173094 F03819 ESTs 4.6

106390 Hs.20166 AA446964 Prostate stem cell antigen 4.6

126959 AA199853 ESTs; Moderately similar to !!!! ALU SUB 4.5

131584 Hs.29117 X91648 H.sapiens mRNA for pur alpha extended 3' 4.5

104838 HSJ20953 AA039481 ESTs 4.5

125661 R50319 ESTs 4.5

103171 Hs.234726 X68733 alpha-1-antichymotrypsin 4.5

103928 Hs.199160 AA280085 ESTs 4.5

102899 Hs.75730 X06272 signal recognition particle receptor ('d 4.5

100892 Hs.180789 HG4557-HT4962 Small Nuclear Ribonucleoprotein U1, 1snr 4.5




CO IS 


45 


iOQATlA He "iMKRA AA179IVW 
1 nS.0 1 / OW nn 1 / <-UOO 


COI o 


4.5 


lUbyyu nS-24/oo AAD^l3o4 


CQTe 


4.5 


1 «it 0 1 0 nS.*f*tO DO UtOOO 1 


Mitman nmfPin immitnA-roartivft with antu 


4.4 


1 oiUJoo ns.oo I/O i oy ooo 


Hnrrtrt canlone mBMA for Y lAAOAflR nmtoin* 


4.4 


133718 rls.19o/o0 AiooOb 


neurofilament; heavy polypeptide (200kD)


4.4 


mi^7n He IFLAfi MWfiQB 
1UI4/U rtS.lO40 M££Oyo 


h rmnr nrnloln nlW fl LPranmoni evnrtmmft\ 


4.4 


131904 HSu2B4Zyo AA1 43019 


co i s, nignry similar to surrace 4 miegr 


4.4 


lUooU4 ns^o!** AAoool**£ 


PCTc 
CO IS 


4.4 


122861 Hs.119394 AA464428 ESTs 4.4


4 4 4 OOC Lie OOQOj* M70XCC 

111336 Hs2yoy4 N/yooo 


CCTe 
CO IS 


4.4 


121944 nsJBolo AM29278 


CCTe 
COlS 


4.4 


134401 nS.2l157/ AA243746 


ESTs; Highly similar to CG1 protein [H.s


4.4 


126458 Hs.2889o9 AA815252 


co 1 s, weawy simnar to iwi alu ouor ami 


4.4 


133435 HS.323ao6 TZ3983 


CCTo< %»nAnr*stnhi rlmtfer fn llll Al 1 1 CI ID 

co I s, Mooeraiefy simi/ar to uu alu oua 


4.4 


105178 Hs.21941 AA187490 ESTs 4.3


127315 AA640B34 


nr27Duo ji noi_ UuAr_rio homo sapiens CUN 


4.3 


132645 Hs.54424 X87870 H.sapiens mRNA for hepatocyte nuclear fa 4.3


4 4£4CO Li* OOOQQA AA/G1XQ7 

116162 HS^o2yyU AA4ol4o7 


Co i s>i vveamy simnar 10 ro^u i^jc iu .eiey 


4.3 


118040 Hs.47567 N52876 EST 4.3

130008 HS.27o4Z/ M31423 cerebellar degeneration-related protein 4.3

126607 Hs.114688 W87424 ESTs 4.3

123061 Hs.105130 AA482030 EST 4.3

109391 Hs.184245 AA219699 ESTs 4.3

109175 AA180496 ESTs 4.3

127003 Hs.173540 AA550806 ESTs; Weakly similar to (defline not ava 4.3

102547 Hs.46638 U57911 chromosome 11 open reading frame 8 4.3

134208 Hs.79993 U88871 peroxisomal biogenesis factor 7 4.3

104258 Hsi>462 AF007216 solute carrier family 4; sodium bicarbon 4.3


1*3/17*0 He 10Q4A AhOQATXi 

iou/oy ns.ioy*+o Mnuy4/^v 


CCTe* Woakhf cimilar tit HpfTlnfl nnt 
CO 1 S», VvcdlUj olillual vJ ^uoiillto IIUI d*a 


4.3 


132160 Hs.295923 AA281770 seven in absentia (Drosophila) homolog 1 4.3

135062 Hs.93872 AA174183 ESTs 4.3

126510 Hs.334762 R49702 ESTs; Weakly similar to KIAA0319 [H.sapi 4.2

122055 Hs.98747 AA431732 EST 4.2

133136 Hs.6574 AF007165 suppressin (nuclear deformed epidermal a 4.2

109890 Hs.20843 H04649 ESTs 4.2

133294 Hs.69997 R79723 H.sapiens mRNA for translin associated z 4.2

134436 Hs.83190 S80437 fatty acid synthase {3' region} [human, 4.2

107375 Hs.251064 U88573 NBR2 4.2

122223 Hs.27413 AA436158 ESTs 4.2

103044 Hs.248210 X55777 H.sapiens Mahlavu hepatocellular carcino 4.2

120125 Hs.59815 W99362 EST 4.2

128969 Hs.283978 T65327 ESTs; Highly similar to (defline not ava 4.2

129637 Hs.1179 D90359 TATA box binding protein (TBP)-associate 4.2

106566 AA455921 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.2

112605 Hs.29852 R79220 ESTs 4.2

103364 Hs.279929 X90872 H.sapiens mRNA for gp25L2 protein 4.2

132811 Hs.57419 U25435 transcriptional repressor 4.2

126570 Hs.326292 T79274 ESTs 4.2

116298 Hs.94109 AA489046 ESTs 4.2

103024 Hs.105938 X53961 lactotransferrin 4.1

129133 Hs.108850 R56728 yg95c6.r1 Soares infant brain 1NIB Homo 4.1

133167 Hs.6641 N98707 kinesin family member 5C 4.1

126871 Hs.14051 AA351779 ESTs 4.1

132333 Hs.45032 AA192157 ESTs 4.1

107376 Hs.327179 U90545 solute carrier family 17 (sodium phospha 4.1

128517 Hs.100861 AA280617 ESTs; Weakly similar to p60 katanin [H.s 4.1

130555 Hs.116774 AA450324 ESTs 4.1

105765 Hs.24183 AA343514 ESTs 4.1

126529 Hs.26369 AA133237 ESTs 4.1

125928 Hs.181889 H29730 ESTs 4.1

117280 Hs.172129 N22107 ESTs; Moderately similar to !!!! ALU SUB 4.1

100234 Hs.3085 D29677 KIAA0054 gene product 4.1

100959 Hs.118127 J00073 actin; alpha; cardiac muscle 4.1

107130 Hs.12913 AA620582 ESTs; Weakly similar to (defline not ava 4.1

105035 Hs.8859 AA128486 ESTs 4.1

126735 Hs.226795 AA808949 glutathione S-transferase pi 4.1

113056 Hs.8036 T26471 ESTs; Moderately similar to !!!! ALU SUB 4

102460 Hs.211582 U48959 Homo sapiens myosin light chain kinase ( 4

106968 Hs.26813 AA504631 ESTs; Weakly similar to (defline not ava 4

123107 Hs.104207 AA486071 ESTs 4

127256 Hs.267967 AA327550 ESTs; Weakly similar to !!!! ALU SUBFAMI 4

105329 Hs.22862 AA234561 ESTs 4

115504 Hs.42736 AA291946 ESTs 4

120726 Hs.97293 AA293656 ESTs 4

103576 Hs.94560 Z26317 desmoglein 2 4

127889 Hs.144941 AI147408 ESTs 4

106394 Hs.25320 AA447223 ESTs 4

128046 AA873285 ESTs 4

103391 Hs.114366 X94453 pyrroline-5-carboxylate synthetase (glut 4

106448 Hs.27004 AA449455 ESTs 4

126513 Hs.86276 W27601 ESTs; Moderately similar to (defline not 4

129593 Hs.98314 AA487015 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.9

110151 Hs.31608 H18836 ESTs 3.9

105344 Hs.8645 AA235303 ESTs 3.9

104791 Hs.301871 AA029046 ESTs 3.9

123442 Hs.111496 AA598803 ESTs 3.9

127800 Hs.79428 AA521047 BCL2/adenovirus E1B 19kD-interacting pro 3.9

114555 Hs.167904 AA058594 ESTs 3.9

122138 Hs.163960 AA435549 ESTs 3.9

129565 Hs.198726 X77777 vasoactive intestinal peptide receptor 1 3.9

103471 Hs.75216 Y00815 protein tyrosine phosphatase; receptor t 3.9

133908 Hs.325474 M83216 caldesmon 1 3.9

105635 Hs.301985 AA281508 ESTs 3.9

134285 Hs.81086 AA460012 solute carrier family 22 (organic cation 3.9

134125 Hs.50421 R381G2 KIAA0203 gene product 3.9

125628 Hs.241493 AA418069 natural killer-tumor recognition sequenc 3.9

103695 Hs.186600 AA018758 ESTs 3.9

100642 Hs.182183 HG2743-HT3926 Caldesmon 1, Alt. Splice 6, Non-Muscle 3.9

104334 Hs.78771 D82614 ESTs 3.9

110242 Hs.19978 H26417 ESTs 3.9

125298 Hs.289008 Z39255 ESTs 3.9

104060 Hs^03193 AA397968 zt87a9.r1 Soares_testis_NHT Homo sapiens 3.9

105823 Hs.293960 AA398197 ESTs 3.9

126499 Hs.110445 AA315671 ESTs; Moderately similar to unknown (M.m 3.9

130752 Hs.18895 D50927 KIAA0137 gene product 3.8

123494 Hs.112110 AA599766 ESTs 3.8

104846 Hs.32478 AA040154 ESTs 3.8

108921 Hs.71721 AA142913 ESTs 3.8

115506 Hs.45207 AA292537 ESTs 3.8

100452 Hs.241552 D87742 Human mRNA for KIAA0268 gene; partial cd 3.8

104454 Hs.129228 M84443 galactokinase 2 3.8

108730 Hs.102859 AA126254 ESTs 3.8

131223 Hs.34427 AA247788 ESTs; Highly similar to (defline not ava 3.8

104784 Hs.269228 AA027055 ESTs 3.8

104946 Hs.73848 AA069549 ESTs 3.8

106932 Hs.9394 AA495926 ESTs 3.8

101724 Hs.620 M69225 bullous pemphigoid antigen 1 (230/240kD) 3.8

106140 Hs.14912 AA424524 Homo sapiens mRNA for KIAA0286 gene; par 3.8

128135 Hs.269721 AA913491 ESTs 3.8

120030 Hs.58694 W92051 ESTs 3.8

126457 Hs.50382 AA007489 zh98g04/1 Soares_fetal_liver_spleen_1NF 3.8

123917 Hs.112969 AA621311 EST 3.7

110714 Hs.17752 H95978 Homo sapiens phosphatidylserine-specific 3.7

130577 Hs.162 M35410 insulin-like growth factor binding prote 3.7

117667 Hs.44708 N392U ser-Thr protein kinase related to the my 3.7

126104 Hs.39712 N77278 ESTs; Weakly similar to BONE/CARTILAGE P 3.7

100379 Hs.278721 D82060 Homo sapiens mRNA for membrane protein w 3.7

115646 Hs.305971 AA404352 ESTs 3.7

125792 Hs.193700 AI005388 ESTs; Moderately similar to !!!! ALU SUB 3.7

102162 Hs.1592 U18291 CDC16 (cell division cycle 16; S. cerevi 3.7

128530 Hs.183475 AA504343 ESTs; Moderately similar to !!!! ALU SUB 3.7

119940 Hs.272531 W86779 EST 3.7

110769 Hs.23837 N22222 yw34b06.s1 Morton Fetal Cochlea Homo sap 3.7

132914 Hs.60293 AA496037 ESTs 3.7

113594 Hs.15683 T92030 ESTs 3.7

103702 Hs.279952 AA027793 ESTs; Highly similar to (defline not ava 3.7

130780 Hs.19347 AA248406 ESTs 3.7

123288 Hs.291025 AA495836 EST 3.7

120691 Hs.22380 AA291173 ESTs 3.7

103153 Hs.75295 X66534 guanylate cyclase 1; soluble; alpha 3 3.7

129201 Hs.109390 H19969 ESTs 3.7

114798 Hs.54900 AA159181 ESTs 3.7

126801 Hs.7337 AA512902 ESTs 3.7

105503 Hs.31707 AA256616 ESTs 3.7

104260 Hs.194283 AF008192 Homo sapiens putative GR6 protein (GR6) 3.7

125980 Hs.35699 R97219 ESTs 3.7

123255 Hs.105273 AA490890 ESTs 3.6

103862 Hs.6363 AA206625 ESTs 3.6

100696 Hs.121686 HG3162-HT3339 Transcription Factor IIa 3.6

134917 Hs.166994 X87241 FAT tumor suppressor (Drosophila) homolo 3.6

103520 Y10511 H.sapiens mRNA for CD176 protein 3.6

113778 Hs.302738 W15263 ESTs 3.6

101838 Hs.75511 M92934 connective tissue growth factor 3.6

113702 T97307 ESTs; Moderately similar to !!!! ALU SUB 3.6

118201 Hs.48428 N59800 EST 3.6

116519 Hs.68554 C20780 EST 3.6

105886 Hs.22983 AA400517 ESTs; Moderately similar to UDP-GLUCOSE: 3.6

106709 Hs.170291 AA464696 ESTs 3.6

127858 HSJ27973 AA806365 oc26h07.s1 NCI_CGAP_GCB1 Homo sapiens cD 3.6

101964 S81578 oToxin-responsive gene {putative polyade 3.6

105508 Hs.326416 AA256680 ESTs 3.6

116844 Hs.337434 H64938 ESTs 3.6

105372 Hs.142296 AA236481 ESTs 3.6

100745 Hs.144630 HG3510-HT3704 V-Erba Related Ear-3 Protein 3.6

127521 Hs.164018 AA809982 ESTs 3.6

110758 Hs.274265 N21365 talin 3.6

107307 Hs.44155 T52099 creatine kinase; mitochondrial 2 (sarcom 3.6

133200 Hs.183639 AA432248 ESTs 3.6

114774 Hs.184325 AA150043 ESTs 3.6

120265 Hs.270696 AA173759 ESTs; Moderately similar to !!!! ALU SUB 3.6

134359 Hs.199067 M34309 v-erb-b2 avian erythroblastic leukemia v 3.6

116250 Hs.44829 AA480975 ESTs; Moderately similar to !!!! ALU SUB 3.6

106313 Hs.35841 AA436459 nuclear factor I/X (CCAAT-binding transc 3.6

131898 Hs.279780 N52232 ESTs 3.6

133444 Hs.73793 M27281 vascular endothelial growth factor 3.6

128232 Hs.334641 H06296 ESTs 3.6

135357 Hs.79572 AA235803 ESTs 3.5

457951 AI369384 arylsulfatase D 3.5

108407 AA075519 zm87h9.s1 Stratagene ovarian cancer (#93 3.5

126659 T16245 a disintegrin and metalloproteinase doma 3.5

104189 Hs.301804 AA485805 ESTs 3.5

125956 Hs.129014 N53276 ESTs 3.5

103026 Hs.79386 X54162 Human mRNA for a 64 Kd autoantigen expre 3.5

133011 Hs.171921 AA042990 sema domain; immunoglobulin domain (Ig); 3.5

131379 Hs^6176 R49035 ESTs 3.5

126742 Hs.169359 H64106 yr57e06.r1 Soares fetal liver spleen 1NF 3.5

105560 Hs.306915 AA262783 ESTs 3.5

118472 Hs.42179 N66818 ESTs 3.5

105623 Hs.30127 AA280895 ESTs; Highly similar to !!!! ALU SUBFAMI 3.5

120262 Hs.145807 AA172076 ESTs; Moderately similar to !!!! ALU SUB 3.5

105027 Hs.26771 AA126472 ESTs 3.5

130760 Hs.18953 AA126997 phosphodiesterase 9A 3.5

117473 Hs.155560 N30157 ESTs 3.5

102663 Hs.168075 U70322 karyopherin (importin) beta 2 3.5

126349 Hs.13531 AM42868 ESTs; Weakly similar to (defline not ava 3.5

132154 Hs.41119 N67179 ESTs 3.5

131689 Hs.30696 AA599653 transcription factor-like 5 (basic helix 3.5

127862 Hs.163191 AA765305 EST 3.5

126995 Hs.189810 W26950 Human DNA sequence from PAC 388M5 on chr 3.5

119071 R31180 ESTs 3.5

103941 Hs.96593 AA282978 ESTs 3.5

110721 HSL31319 H97678 ESTs 3.5

126586 Hs.43086 AA011247 ESTs 3.5

103106 Hs.1857 X62025 phosphodiesterase 6G; cGMP-specific; rod 3.5

116357 Hs.90797 AA504806 Homo sapiens clone 23620 mRNA sequence 3.5

105309 Hs.4104 AA233790 ESTs 3.5

130796 Hs.19525 R39390 ESTs 3.5

109101 Hs.52184 AA167708 ESTs 3.5

103134 Hs.2839 X65724 Norrie disease (pseudoglioma) 3.5

131798 Hs.301449 X86098 adenovirus 5 E1A binding protein 3.5

118535 Hs.49418 N67966 ESTs 3.5

102592 Hs.11223 U62389 Human putative cytosolic NADP-dependent 3.4

125905 Hs.6456 T69868 chaperonin containing TCP1; subunit 2 (b 3.4

109160 Hs.301997 AA179387 ESTs 3.4

105327 Hs.211593 AA234440 ESTs 3.4

106586 Hs.57787 AA456598 ESTs 3.4

122635 AA454085 EST 3.4

132413 Hs.260116 AA132969 metalloprotease 1 (pitrilysin family) 3.4

131938 Hs.34956 AA283620 ESTs 3.4

133871 Hs.182793 AA454597 ESTs 3.4

107175 Hs.292503 AA621751 ESTs; Weakly similar to KIAA0601 protein 3.4

101188 Hs.184298 L20320 cyclin-dependent kinase 7 (homolog of Xe 3.4

126422 Hs£37658 H48518 ESTs; Highly similar to apolipoprotein A 3.4

118475 N66845 ESTs; Weakly similar to !!!! ALU CLASS B 3.4

104558 Hs.88959 R56678 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.4

128307 Hs.132005 AI453794 ESTs 3.4

112254 Hs.25829 R51831 ESTs 3.4

125408 Hs.89578 N72353 yv37e12j1 Soares fetal liver spleen 1NF 3.4

109834 Hs.175955 H00604 ESTs 3.4

130844 Hs.20191 D12122 seven in absentia (Drosophila) homolog 2 3.4

127143 Hs.20843 AA533553 nI68h04.s1 NCI_CGAP_Pr10 Homo sapiens cD 3.4

135309 Hs.42500 D25984 ESTs 3.4

125724 Hs.295978 AA083407 stimulated trans-acting factor (50 kDa) 3.4

127692 Hs.187983 AI021912 ESTs 3.4

116674 Hs.92127 F04816 ESTs 3.4

134700 Hs.8868 AA481414 golgi SNAP receptor complex member 1 3.4

114846 Hs.166196 AA234929 ESTs 3.4

103649 Hs.155983 Z70219 H.sapiens mRNA for 5'UTR for unknown pro 3.4

134835 Hs.89925 L04569 calcium channel; voltage-dependent; L ty 3.4

130568 Hs.16085 AA232535 ESTs; Highly similar to (defline not ava 3.4

111331 Hs.15978 N78773 ESTs 3.4

106036 Hs.10653 AA412505 ESTs 3.4

130987 Hs.21893 R45698 ESTs 3.4

112814 Hs.35828 R98192 ESTs 3.4

127815 Hs^55015 AA876009 ob93c10.s1 NCI_CGAP_GCB1 Homo sapiens cD 3.4

100144 Hs.75616 D13643 KIAA0018 gene product 3.4

101129 Hs.247992 L10405 Homo sapiens DNA binding protein for sur 3.4

130874 Hs.20621 T08287 ESTs 3.4

106882 Hs.26994 AA489009 ESTs 3.4

103855 Hs.302267 AA195179 ESTs 3.4

125957 H45213 yo03b08.r1 Soares adult brain N2b5HB55Y 3.3

114048 Hs.146085 W94613 ESTs 3.3

109826 Hs.75354 F13702 ESTs 3.3

125355 Hs.170098 R45630 ESTs; Highly similar to KIAA0372 (H.sapi 3.3

104182 Hs.143792 AA479990 ESTs; Weakly similar to glioma amplified 3.3

100294 Hs.75454 D49396 Human mRNA for Apo1_Human (MER5(Aop1-Mou 3.3

131688 Hs.30692 U24153 p21 (CDKN1A)-activated kinase 2 3.3

116256 Hs.88201 AA481256 ESTs; Weakly similar to (defline not ava 3.3

102034 Hs.230 U05291 fibromodulin 3.3

130072 Hs.14658 R99606 Human chromosome 5q13.1 clone 5G8 mRNA 3.3

114615 Hs.159456 AA083812 ESTs; Highly similar to (defline not ava 3.3

128707 Hs.104105 AA136474 Meis (mouse) homolog 2 3.3

115048 Hs.190057 AA252668 ESTs 3.3

125862 Hs.31110 H12084 ESTs 3.3

135142 Hs.24192 R31679 ESTs 3.3

103119 Hs.2877 X63629 cadherin 3; P-cadherin (placental) 3.3

104460 Hs.62604 M91504 ESTs 3.3

100365 Hs.79284 D78611 mesoderm specific transcript (mouse) hom 3.3

131524 Hs.301804 N39152 ESTs 3.3

102165 Hs.159627 U18321 Death associated protein 3 3.3

126966 Hs.182575 R38438 solute carrier family 15 (H+/peptide tra 3.3

124839 Hs.140942 R55784 ESTs 3.3

100709 Hs.100469 HG3264-HT3441 Af-6 (Gb:U02478) 3.3

132967 Hs.61635 AA032221 Homo sapiens BAC clone RG041D11 from 7q2 3.3

102927 Hs.65114 X12876 keratin 18 3.3

132616 Hs.283558 AA386264 ESTs 3.3

125132 Hs.129781 W15495 ESTs 3.3

111225 Hs.21652 N68989 ESTs 3.3

114956 HSJ7113 AA243681 ESTs 3.3

122235 Hs.112227 AA436475 ESTs 3.3

112325 Hs.12315 R56055 ESTs 3.3

123360 Hs.178604 AA504784 ESTs 3.3

105150 Hs.155995 AA169640 Homo sapiens mRNA for KIAA0643 protein; 3.3

107391 Hs.284294 W02877 ESTs 3.3

113058 Hs.7569 T26893 EST 3.3

134371 Hs.82318 S69790 Brush-1 3.3

125669 Hs.333256 R51308 ESTs; Moderately similar to !!!! ALU SUB 3.3

111506 Hs.294105 R07726 ESTs 3.3

122974 Hs.194215 AA478625 ESTs 3.3

102369 Hs.299867 U39840 hepatocyte nuclear factor 3; alpha 3.3

120408 Hs.190151 AA235045 ESTs 3.3

117993 Hs.47402 N52039 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.3

129586 Hs.11500 AA437118 ESTs 3.3

128138 Hs.126494 AI200825 ESTs 3.3

127265 AA332751 EST37214 Embryo, 8 week I Homo sapiens c 3.3

107674 Hs.41143 AA011027 Homo sapiens mRNA for KIAA0581 protein; 3.2

104866 Hs.293691 AA045342 ESTs 3.2

103427 Hs.250655 X97303 H.sapiens mRNA for Ptg-12 protein 3.2

132990 Hs.334334 AA458761 ESTs 3.2

127017 Hs.251946 AA740146 ESTs 3.2

132313 Hs.44481 U13220 forkhead (Drosophila)-like 6 3.2

106880 Hs.32425 AA488889 ESTs 3.2

107039 Hs.169780 AA599751 homologous to yeast nitrogen permease (c 3.2

120870 Hs.292581 AA357172 ESTs 3.2

107920 Hs.284207 AA027951 ESTs 3.2

104165 Hs.105116 AA459160 EST 3.2

107012 Hs.63908 AA598745 ESTs 3.2

103605 Hs.194857 Z354G2 H.sapiens gene encoding E-cadherin, exon 3.2

124006 HSJ270016 D603Q2 ESTs 3.2

101300 Hs.74137 L40391 Homo sapiens (clone s153) mRNA fragment 3.2

101183 Hs.795 L19779 H2A histone family; member O 3.2

125596 R25698 yg44h11.r2 Soares infant brain 1NIB Homo 3.2

127261 AA661567 nu86b02.s1 NCI_CGAP_Alv1 Homo sapiens cD 3.2

120090 Hs.59554 W94591 ESTs 3.2

129393 Hs.166982 D13435 phosphatidylinositol glycan; class F 3.2

120923 Hs.97129 AA382283 ESTs 3.2

118907 Hs.274255 N91003 ESTs 3.2

111552 Hs.191185 R09411 ESTs 3.2

104431 Hs.99913 J03019 adrenergic; beta-1-; receptor 3.2

133551 Hs.278634 D63480 Human mRNA for KIAA0146 gene; partial cd 3.2

131615 Hs.192803 D14533 xeroderma pigmentosum; complementation g 3.2

126547 Hs.84072 U47732 transmembrane 4 superfamily member 3 3.2

103172 Hs.116774 X68742 integrin; alpha 1 3.2

113867 Hs.24095 W68845 ESTs 3.2

133323 Hs.70937 Z83735 H3 histone family; member K 3.2

111597 Hs.189716 R11499 ESTs 3.2

121515 Hs.104696 AA412133 ESTs 3.2

107445 Hs.6639 W28406 ESTs 3.2

106887 Hs.334335 AA489091 ESTs 3.2

123052 Hs.185766 AA481806 ESTs 3.2

107072 Hs.130760 AA609113 Homo sapiens mRNA; cDNA DKFZp586N0318 (f 3.2

102214 Hs.32964 U23752 SRY (sex-determining region Y)-box 11 3.2

123147 AA487961 ab11h6.s1 Stratagene lung (#93721) Homo 3.2

125435 Hs.272138 R00940 ye87g03.r1 Soares fetal liver spleen 1NF 3.2

116246 Hs.250646 AA479981 ESTs; Highly similar to ubiquil^^ 3.2

105169 Hs.180789 AA180321 Homo sapiens (clone S164) mRNA; 3' end o 3.2

134001 Hs.78344 AF001548 myosin; heavy polypeptide 11; smooth mus 3.2

124866 Hs.304389 R68571 ESTs 3.2

133205 Hs.67619 AA089559 Homo sapiens mRNA; chromosome 1 specific 3.2

102986 Hs.182378 X17648 colony stimulating factor 1 (macrophage) 3.2

101232 Hs.242894 L28997 ADP-ribosylation factor-like 1 3.1

132906 Hs.234896 AA142857 ESTs; Highly similar to geminin [H.sapie 3.1

104281 Hs.5669 C14290 ESTs 3.1

123926 Hs.227933 AA621348 ESTs; Highly similar to (defline not ava 3.1

134464 Hs.239720 N79354 ESTs; Weakly similar to Rga [D.melanogas 3.1

105322 Hs.16346 AA234100 ESTs 3.1

100631 Hs.48332 HG2709-HT2805 Serine/Threonine Kinase (Gb:Z25431) 3.1

130791 Hs.199263 AA259102 ESTs; Highly similar to (defline not ava 3.1

131220 Hs.300855 R77200 ESTs 3.1

113237 Hs.123642 T62857 ESTs 3.1

125562 Hs.98968 AM94372 ESTs 3.1

134110 HsJ9136 U41060 Human breast cancer, estrogen regulated 3.1

132393 Hs.47334 W85888 ESTs; Moderately similar to !!!! ALU SUB 3.1

107439 Hs.296842 W27995 ESTs; Moderately similar to non-muscle m 3.1

125863 Hs.40719 AA299096 Homo sapiens mRNA; cDNA DKFZp564M0916 (f 3.1

105811 Hs.286192 AA394121 ESTs 3.1

129284 Hs.296141 AA104023 ESTs 3.1

125321 Hs.178294 T86652 ESTs 3.1

107332 Hs.183297 T87750 ESTs 3.1

123570 Hs.109653 AA608955 ESTs 3.1

100384 Hs.90800 D83646 matrix metalloproteinase 16 (membrane-in 3.1

109063 Hs^8972 AA161043 tetraspan 1 3.1

133284 Hs.182828 U09367 zinc finger protein 136 (clone pHZ-20) 3.1

131839 Hs.33010 H80622 Homo sapiens mRNA for KIAA0633 protein; 3.1

117606 Hs.44698 N35115 ESTs 3.1

418998 Hs.287849 F13215 ESTs 3.1

125180 Hs.103120 W58344 ESTs 3.1

100789 HG3893-HT4163 Phosphoglucomutase 1, Alt. Splice 3.1

126017 Hs.159440 H60487 ESTs 3.1

132452 Hs.247324 AA005262 Homo sapiens DNA sequence from PAC 262D1 3.1

129077 Hs.108479 H78886 ESTs 3.1

126563 Hs.181368 W26247 U5 snRNP-specific protein (220 kD); orth 3.1

129650 Hs.118258 N52554 ESTs 3.1

123465 AA599033 ESTs 3.1

126486 Hs.152316 AA345339 EST51345 Gall bladder II Homo sapiens cD 3.1

126460 Hs.167031 W01616 za36d05.r1 Soares fetal liver spleen 1NF 3.1

118697 Hs.43234 N72094 ESTs 3.1

103860 Hs.38057 AA203742 ESTs 3.1

127868 Hs.124347 AA971439 ESTs 3.1

124984 Hs.223241 T47566 yb15c1U1 Stratagene placenta (#937225) 3.1

103903 Hs.15220 AA249334 j312.seq.F Human fetal heart, Lambda ZAP 3.1

106697 Hs.22242 AA463737 ESTs 3.1

130892 Hs.20993 AA442604 ESTs; Weakly similar to Ydr374cp [S.cere 3

114032 Hs.35014 W92779 ESTs 3

128835 Hs.106390 W15528 ESTs 3

103657 Hs.247815 Z80788 H.sapiens H4/1 gene 3

126264 Hs.250614 N42897 yy13h06.r1 Soares melanocyte 2NbHM Homo 3

132626 Hs.21275 D25755 ESTs 3

131107 Hs.75354 N87590 ESTs 3

126780 Hs.5811 R12421 ESTs 3

127363 Hs.22116 AA307744 Homo sapiens Cdc14B1 phosphatase mRNA; c 3

103690 Hs.84063 AA016186 ESTs 3

102589 Hs.8867 U62015 Homo sapiens Cyr61 mRNA, complete cds 3

125144 Hs.24336 W37999 ESTs 3

132977 Hs.301404 U28686 RNA binding motif protein 3 3

120714 Hs.146170 AA292689 ESTs 3

101038 Hs.79411 J05249 replication protein A2 (32kD) 3

102856 Hs.248177 X00090 Human histone H3 gene 3

105516 Hs^0738 AA257971 ESTs 3

131137 Hs.33287 U85193 nuclear factor I/B 3

127221 Hs^41551 AI354332 ESTs 3

411888 Hs.24104 R26708 ESTs 3

131684 Hs.3066 U26174 granzyme K (serine protease; granzyme 3; 3

100629 Hs.21291 HG2706-HT2802 Serine/Threonine Kinase (Gb:Z25428) 3

119944 Hs^8915 W86838 EST 3

113801 Hs.118281 W38418 zinc finger protein 266 3

133780 Hs.76152 M14219 decorin 3

104690 Hs.14449 AA010889 ESTs 3

126371 Hs.304139 N57645 EST 3

127635 Hs.116346 AA766903 ESTs 3

128434 Hs.143880 AI190914 ESTs 3

435761 Hs.187555 AA701941 ESTs 3

125025 Hs.50748 T71561 ESTs 3

124940 Hs.103804 R99599 heterogeneous nuclear ribonucleoprotein 3

128742 Hs.251531 D00763 proteasome (prosome; macropain) subunit; 3

107147 Hs.10450 AA621125 Homo sapiens chromosome 2; 10 repeat reg 3

112068 Hs.22545 R43910 ESTs 3

105346 Hs.263727 AA235465 ESTs; Moderately similar to !!!! ALU SUB 3

130972 Hs.21739 AA370302 Homo sapiens mRNA; cDNA DKFZp586I1518 (f 3

131230 Hs.274407 AA145987 thymus specific serine peptidase 3

133743 Hs.75847 N79435 ESTs 3

127402 Hs^27949 AA358869 ESTs; Highly similar to SEC13-RELATED PR 3

117483 Hs.44189 N30426 ESTs 3

123659 Hs.112699 AA609368 ESTs 3

103963 Hs.63290 AA298588 EST114219 HSC172 cells II Homo sapiens c 3

103795 Hs.7367 AA112222 ESTs; Moderately similar to (defline not 3

115092 Hs.80975 AA255903 CD39-like 4 2.9

134831 Hs.89890 S72370 pyruvate carboxylase 2.9

128579 Hs.101810 AA093378 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.9

134193 Hs.7980 F09570 ESTs 2.9

123522 Hs.112575 AA608577 ESTs 2.9

107109 Hs.32793 AA609943 ESTs 2.9

134694 Hs.88556 D50405 histone deacetylase 1 2.9

134399 Hs.82689 H99801 tumor rejection antigen (gp96) 1 2.9

134632 Hs.174139 AA398710 H. sapiens RNA for CLCN3 2.9

106683 Hs.14512 AA461495 ESTs 2.9

108555 AA084963 zn13e12.s1 Stratagene hNT neuron (#93723 2.9

100953 Hs.2110 HG945-HT945 Nucleic Acid-Binding Protein (Gb:L12693) 2.9

130597 Hs.16492 AA173998 ESTs; Weakly similar to weakly similar t 2.9

101813 Hs.139226 M87338 replication factor C (activator 1) 2 (40 2.9

106636 Hs.286 AA459950 ESTs 2.9

129109 Hs.108708 AA491295 calcium/calmodulin-dependent protein kin 2.9

125819 Hs.251871 AA044840 stromal cell-derived factor 1 2.9

106282 Hs.9857 AA433946 ESTs; Weakly similar to (defline not ava 2.9

100386 Hs.301636 D83703 peroxisomal biogenesis factor 6 2.9

114546 Hs.98074 AA056263 ESTs; Moderately similar to !!!! ALU SUB 2.9

105914 Hs.9701 AA402224 Homo sapiens growth arrest and DNA-damag 2.9

108552 AA084912 zn11c7.s1 Stratagene hNT neuron (#937233 2.9

126505 Hs.190057 W26894 16a1 1 Human retina cDNA randomly primed 2.9

134098 Hs.79086 X06323 Human MRL3 mRNA for ribosomal protein L3 2.9

129721 Hs.211539 L19161 eukaryotic translation initiation factor 2.9

100076 Hs.277422 AB000897 Homo sapiens mRNA for cadherin FIB3, par 2.9

117466 Hs.44104 N29862 ESTs 2.9

106335 Hs.36688 AA437258 ESTs; Moderately similar to WAP four-dis 2.9

134510 Hs.250870 U25265 protein kinase; mitogen-activated; kinas 2.9

105835 Hs.32995 AA398412 ESTs 2.9

106611 Hs.26267 AA458904 ESTs; Weakly similar to torsinA [H.sapie 2.9

134087 Hs.173824 U51166 thymine-DNA glycosylase 2.9

100641 Hs.182183 HG2743-HT2846 Caldesmon 1, Alt Splice 4, Non-Muscle 2.9

104802 R86920 ESTs 2.9

117203 Hs.42738 H99799 ESTs 2.9

131889 Hs.34073 AA401912 BH-protocadherin (brain-heart) 2.9

101707 Hs.155212 M65131 methylmalonyl Coenzyme A mutase 2.9

115271 Hs.5724 AA279422 ESTs 2.9

125812 Hs.287912 H73420 lectin; mannose-binding; 1 2.9

110740 Hs.19762 H99675 ESTs 2.9

103406 Hs.285728 X95677 H.sapiens mRNA for ArgBPIB protein 2.9

104577 Hs.132390 R71539 ESTs 2.9

102772 Hs.161002 U83115 absent in melanoma 1 2.9

131710 Hs.30985 AA233225 ESTs; Highly similar to (defline not ava 2.9

125231 Hs.268903 W84714 ESTs 2.9 

127380 Hs.15535 A1417137 Homo sapiens clone 24582 mRNA sequence 2.9 

104229 Hs.61289 AB002346 inositol phosphate 5 , -phosphatase 2 (syn 2.9 

5 126600 Hs.191385 AA699949 ESTs 2.9 

125175 Hs.303030 W52355 EST 2.9 

103849 Hs.34578 AA187045 ESTs; Weakly similar to t!!i ALU SUBFAMl 2.9 

102126 Hs.78961 U14575 protein phosphatase 1; regulatory (inhib 2.9 

124906 Hs.107815 R87647 ESTs 2.9 

10 131148 Hs.303125 C00038 ESTs 2.9 

123158 Hs.218329 AA488658 heat shock 70kD protein 1 2.9 

133667 Hs.75462 U72649 Human BTG2 (BTG2) mRNA; comptete cds 2.9 

105182 Hs.18271 AA191014 ESTs; Weakly similar to Ydr372cp [S.cere 2.9 

133968 Hs.232068 D 15050 Human mRNA for transcription factor AREB 2.9 

15 117425 Hs.336901 N27154 ESTs 2.9 

111087 Hs.37637 N59645 ESTs 2.9 

129641 Hs.11805 N66066 ESTs 2.9 

128639 Hs.102897 N91246 ESTs 2.9 

133209 Hs.79265 AA1 14183 ESTs; Moderately similar to gtutamate py 2.9 

20 135154 Hs.267812 AA126433 sorting nexin 4 25 

126838 Hs-279609 AA858097 pigment epithefium-derived factor 2.9 

103803 Hs.106149 AA127696 ESTs 2.9 

102139 Hs.2128 U15932 dual specificity phosphatase 5 2.9 

128104 AA971000 op67g1 1.s1 Soares_Na_T_GBC_S1 Homo sapi 2.8 

25 127834 Hs.337631 AA761415 nz22d08.s1 NCLCGAP_GCB1 Homo sapiens cD 2.8 

133101 Hs.180952 AA488230 ESTs 2.8 

127250 Hs.217916 AI023717 ESTs 2.8 

135063 Hs.93883 D10537 myelin protein zero (Charcot-Marie-Tooth 2.8 

126323 Hs.68644 N45014 yy80g06.r1 Soares_multiple_sclerosis_2Nb 2.8 

121873 Hs.145696 AA426270 ESTs 2.8 

122090 Hs.98684 AA432141 ESTs 2.8 

118728 Hs.322645 N73705 ESTs 2.8 

135400 Hs.99915 M23263 androgen receptor (dihydrotestosterone r 2.8 

125278 Hs.129998 W93523 ESTs 2.8 

124387 Hs.109019 N27637 ESTs 2.8 

124803 Hs.12186 BA5480 cyclin K 2.8 

H45968 Hs.32149 H45968 ESTs 2.8 

104261 Hs.5409 AF008442 RNA polymerase I subunit 2.8 

105366 Hs£82093 AA236356 ESTs 2.8 

106070 Hs.5957 AA417761 Homo sapiens clone 24416 mRNA sequence 2.8 

131356 Hs£5960 M13241 v-myc avian myelocytomatosis viral relat 2.8 

112009 HS26255 R42714 EST 2.8 

133199 Hs.250175 AA609773 Homo sapiens clone 23904 mRNA sequence 2.8 

110379 Hs.33130 H44825 ESTs 2.8 

103890 Hs.72085 AA236843 ESTs; Weakly similar to unknown [S.cerev 2.8 

128152 R20353 yg20f10.r1 Soares infant brain 1NIB Homo 2.8 

107008 Hs.23740 AA598710 ESTs 2.8 

135243 Hs.97101 AA215333 ESTs 2.8 

103058 Hs.184510 X57348 stratifin 2.8 

132020 Hs.293845 AA428990 ESTs 2.8 

116354 Hs.292566 AA504262 ESTs 2.8 

125867 Hs.12372 H98141 ESTs 2.8 

120603 Hs.98541 AA282787 ESTs; Highly similar to (defline not ava 2.8 

115119 Hs.46847 AA256524 Human DNA sequence from clone 30M3 on ch 2.8 

133865 Hs.170290 F09315 discs; large (Drosophila) homolog 5 2.8 

109415 Hs.110826 AA227219 Homo sapiens CAGF9 mRNA; partial cds 2.8 

128687 Hs.23767 Z38910 ESTs 2.8 

109984 Hs.10299 H09594 ESTs; Moderately similar to !!!! ALU SUB 2.8 

133179 Hs.66731 U81599 homeobox B13 2.8 

115998 Hs.336629 AA448488 ESTs; Weakly similar to zinc finger prot 2.8 

112180 Hs.25067 R49116 EST 2.8 

120428 Hs.173694 AA236822 ESTs; Moderately similar to (defline not 2.8 

106241 Hs.6019 AA430108 ESTs 2.8 

131060 Hs.22564 AA160890 myosin VI 2.8 

111383 Hs.40919 N94527 ESTs 2.8 

102123 Hs.1594 U14518 centromere protein A (17kD) 2.8 

102722 Hs.79981 U79242 Human clone 23560 mRNA sequence 2.8 

129887 Hs^74324 W92041 PCAF associated factor 65 alpha 2.8 

126663 Hs.181297 AA714635 ESTs 2.8 

104367 Hs.134342 H17438 ESTs; Weakly similar to seventransmembra 2.8 

107316 Hs.193700 T63174 ESTs; Moderately similar to !!!! ALU SUB 2.8 

128059 Hs.145096 AA972446 ESTs 2.8 

124447 N48000 ESTs 2.8 

111398 Hs.125565 R00086 deafness; X-linked 1; progressive 2.8 

134085 Hs.79018 U20979 chromatin assembly factor I (150 kDa) 2.8 

124788 Hs.100912 R43543 ESTs 2.8 

112248 Hs.326416 R51361 ESTs 2.8 

121309 Hs.97312 AA402482 ESTs 2.8 

103076 Hs.75319 X59618 ribonucleotide reductase M2 polypeptide 2.8 

107071 Hs.35198 AA60S053 ESTs 2.8 

104425 Hs.35380 H88496 ESTs 2.8 

132991 Hs.62245 AA446906 solute carrier family 25 (mitochondrial 2.8 

104968 Hs.29669 AA084602 ESTs 2.8 

121153 Hs.97694 AA399640 ESTs 2.8 

131216 Hs.243901 D31058 ESTs 2.8 

109682 Hs.22869 F09299 ESTs 2.8 

131990 Hs.168818 H77734 ESTs; Moderately similar to roundabout 1 2.8 

132027 Hs.181444 N78844 ESTs; Weakly similar to R12C12.6 [C.eleg 2.8 

127383 Hs.190478 AA447990 ESTs 2.8 

132598 Hs.530 M81379 collagen; type IV; alpha 3 (Goodpasture 2.8 

101121 Hs.1313 L09753 tumor necrosis factor (ligand) superfami 2.8 

123000 Hs.105640 AA479347 ESTs 2.8 

121329 Hs.1755 AA404324 ESTs 2.8 



100481 Hs.121489 HG1098-HT1098 Cystatin D 2.7 



113803 Hs.283683 W42789 ESTs 2.7 

110934 Hs.169001 N48708 ESTs; Weakly similar to cytochrome P450 2.7 

432888 T86823 ESTs 2.7 

121802 Hs.188898 AA424328 ESTs 2.7 

130396 Hs.155313 AB002331 Human mRNA for KIAA0333 gene; partial cd 2.7 

121103 Hs.97697 AA398936 ESTs; Weakly similar to (defline not ava 2.7 

131129 Hs.23240 R27296 ESTs 2.7 

130943 Hs.272429 050855 calcium-sensing receptor (hypocalciuric 2.7 

134676 Hs.87819 W28051 ESTs; Weakly similar to keratin 9; cytos 2.7 

111900 Hs.25318 R39044 ESTs 2.7 

106025 Hs.173334 AA412063 ESTs 2.7 

126144 Hs.40639 N39696 yx92a07.r1 Soares melanocyte 2NbHM Homo 2.7 

103248 Hs.75262 X77383 cathepsin O 2.7 

127230 Hs.274170 H30501 Homo sapiens Opa-Interacting protein OIP 2.7 

101584 Hs.84072 M35252 transmembrane 4 superfamily member 3 2.7 

124131 Hs.167489 H19980 ESTs 2.7 

129689 Hs.77873 AA130156 ESTs 2.7 

132892 Hs.9973 W92797 ESTs 2.7 

120827 Hs.132967 AA347717 ESTs 2.7 

134579 Hs.85963 N23222 ESTs; Moderately similar to !!!! ALU SUB 2.7 

106149 Hs.256301 AA424881 ESTs 2.7 

132037 Hs.332541 AA203649 ESTs; Weakly similar to HEM45 [H.sapiens 2.7 

130542 Hs.179825 U64675 Human sperm membrane protein BS-63 mRNA, 2.7 

122851 Hs.99598 AA463627 ESTs 2.7 

134983 Hs.196384 D28235 prostaglandin-endoperoxide synthase 2 (p 2.7 

120537 Hs.160422 AA262790 ESTs 2.7 

131036 Hs.174140 X64330 ATP citrate lyase 2.7 

133889 Hs.211582 AA099391 ESTs 2.7 

128847 Hs.106529 AA424199 zv81e01.r1 Soares_total_fetus_Nb2HF8_9w 2.7 

112755 Hs.306044 R93802 ESTs 2.7 

423239 AA323591 EST26392 Cerebellum II Homo sapiens cDNA 2.7 

105031 Hs.12321 AA127240 ESTs 2.7 

126021 Hs.187516 AA775894 ESTs 2.7 

102116 U13706 Human ELAV-like neuronal protein 1 isofo 2.7 

133394 Hs.237225 R16759 ESTs; Weakly similar to (defline not ava 2.7 

104267 Hs.278439 C00358 ESTs 2.7 

107614 Hs.40241 AA004878 ESTs; Highly similar to (defline not ava 2.7 

129809 Hs.1259 X55283 asialoglycoprotein receptor 2 2.7 

112109 Hs.283309 R45221 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.7 

128422 T85681 yd60c06Jl Soares fetal liver spleen 1NF 2.7 

109494 Hs.43899 AA233702 ESTs 2.7 

118696 Hs.292284 N72086 Homo sapiens RNA polymerase III largest 2.7 

106053 Hs.36727 AA416963 ESTs; Highly similar to histone H2A [H.s 2.7 

104440 Hs.284380 L20492 gamma-glutamyltransferase 1 2.7 

129426 Hs.111323 AA412087 EST; Highly similar to (defline not ava 2.7 

123798 AA620411 small inducible cytokine A5 (RANTES) 2.7 

106716 Hs.238928 AA464962 ESTs 2.7 

103663 Z76291 Z78291 Homo sapiens brain fetus Homo sap 2.7 

114162 Hs.32265 Z38909 ESTs 2.7 

113063 Hs.5027 T32438 ESTs 2.7 

127897 AA773857 af80c09.r1 Soares_NhHMPu_S1 Homo sapiens 2.7 

130621 Hs.16803 AA621718 ESTs; Weakly similar to (defline not ava 2.7 

116245 Hs.42796 AA479958 ESTs; Highly similar to (defline not ava 2.7 

125499 R11878 yf49d11.r1 Soares infant brain 1NIB Homo 2.7 

133960 Hs.77899 M19267 tropomyosin 1 (alpha) 2.7 

104470 Hs.246358 N28843 ESTs; Weakly similar to Similar to coBa 2.7 

134982 Hs.92308 N46086 ESTs 2.7 

106803 Hs.284295 AA479114 ESTs 2.7 

104899 Hs.285574 AA054726 ESTs 2.7 

125401 Hs.337585 AI204637 ESTs; Moderately similar to KIAA0350 (H. 2.7 

111253 Hs.15768 N70042 ESTs; Moderately similar to !!!! ALU SUB 2.7 

118449 Hs.164478 N66413 ESTs; Weakly similar to (defline not ava 2.7 

134507 Hs.84318 M63488 replication protein A 1 (70kD) 2.7 

121609 Hs.98185 AA416867 EST 2.7 

113835 Hs57475 W56590 ESTs 2.7 

113962 Hs.285290 W86375 ESTs; Highly similar to (defline not ava 2.7 

121913 Hs.98558 AA42B062 ESTs 2.7 

108194 Hs.216717 AA057250 ESTs 2.7 

130799 Hs.12696 AA464273 ESTs 2.7 

123184 Hs.18166 AA489072 Homo sapiens mRNA for KIAA0870 protein; 2.7 

103420 Hs.173497 X97065 SEC23-like protein B 2.7 

106186 Hs.6315 AA427398 acetylserotonin N-methyltransferase-like 2.7 

101349 L77559 Homo sapiens DGS-B partial mRNA 2.7 

112954 Hs.6655 T16559 ESTs 2.7 

133054 Hs^91079 R07876 ESTs; Weakly similar to unknown [S.cerev 2.7 

128131 Hs.25640 AI283162 claudin 3 2.6 

101864 Hs.75777 M95787 transgelin 2.6 

111948 Hs.26303 R40752 ESTs 2.6 

130145 Hs.151051 U07620 protein kinase mitogen-activated 10 (MAP 2.6 

126507 Hs.23964 AI362218 ESTs 2.6 

117903 Hs.47111 N50740 ESTs 2.6 

116345 Hs.199067 AA495981 ESTs 2.6 

132227 Hs.4248 AA412620 ESTs 2.6 

125746 Hs.274256 K03574 yj42b06j1 Soares placenta Nb2HP Homo sa 2.6 

105073 Hs.89463 AA137034 ESTs 2.6 

102764 U82310 Homo sapiens unknown protein mRNA, parti 2.6 

131367 Hs.173933 AA456687 ESTs 2.6 

130792 Hs.19500 AA307896 nuclear localization signal deleted in v 2.6 

107427 Hs.46736 W26975 ESTs 2.6 

117477 Hs.44175 N30328 ESTs 2.6 

106290 Hs.16364 AA435542 ESTs 2.6 

126829 Hs.7910 R11547 ESTs 2.6 

118836 Hs.173001 N79820 ESTs 2.6 

100147 Hs.136348 D13666 osteoblast specific factor 2 (fasciclin 2.6 

104278 Hs.109253 C02582 ESTs; Highly similar to (defline not ava 2.6 

135051 Hs.83484 C15324 ESTs 2.6 

126081 Hs.227835 AI346024 collagen; type I; alpha 1 2.6 

123579 AA608983 af5d4.s1 Soares_testis_NHT Homo sapiens 2.6 

130115 Hs.149923 M31627 X-box binding protein 1 2.6 

101434 Hs.1430 M20218 coagulation factor XI (plasma thrombopla 2.6 

122962 Hs.104720 AA478429 ESTs; Moderately similar to !!!! ALU SUB 2.6 

126151 Hs.40808 AA324743 ESTs 2.6 

128925 Hs.21851 061676 Homo sapiens mRNA; cDNA DKFZp586J2118 (f 2.6 

128919 Hs.103391 L27559 insulin-like growth factor binding prote 2.6 

130296 Hs.154103 R09286 LIM protein (similar to rat protein Wna 2.6 

128402 Hs.191637 AA457244 ESTs 2.6 

129273 Hs.109968 W63783 ESTs 2.6 

125483 Hs.7788 F07759 ESTs 2.6 

132953 Hs.321264 AA029927 ESTs 2.6 

130963 Hs.21639 U57099 nuclear protein; marker for differential 2.6 

120614 Hs.194154 AA284281 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.6 

123251 Hs.103267 AA490858 ESTs; Moderately similar to Rabtn3 (Rj» 2.6 

121710 Hs.96744 AA419011 ESTs 2.6 

125428 Hs.851 W74608 ESTs; Highly similar to (defline not ava 2.6 

115906 Hs.82302 AA436616 ESTs 2.6 

108432 AA076626 Homo sapiens clone 23851 mRNA sequence 2.6 

126191 Hs.191911 H97728 ESTs 2.6 

106164 Hs.281434 AM25773 ESTs 2.6 

111519 Hs.268615 R08165 ESTs 2.6 

134590 Hs.173840 W58612 ESTs 2.6 

102565 U59748 Human desert hedgehog (hDHH) mRNA, parti 2.6 

129879 Hs.13109 AA194973 ESTs 2.6 

114264 Hs.334609 Z40074 ESTs 2.6 

106236 Hs.21104 AA429951 ESTs 2.6 

135192 Hs.321709 ARJ0Q234 purinergic receptor P2X; ligand-gated io 2.6 

109833 Hs.29889 H00580 ESTs 2.6 

105756 Hs.8535 AA303088 ESTs; Weakly similar to transformation-r 2.6 

121422 Hsl97967 AA406210 ESTs 2.6 

130417 Hs.155485 U58522 Human huntingtin interacting protein (HI 2.6 

124312 Hs.102329 H94647 ESTs 2.6 

108998 Hs.97199 AA156058 ESTs 2.6 

127081 Hs.180591 R88362 ESTs; Weakly similar to weak similarity 2.6 

129574 Hs.11463 AA458603 ESTs; Weakly similar to (defline not ava 2.6 

112410 Hs^6904 R61680 ESTs 2.6 

123929 Hs.112981 AA621364 ESTs 2.6 

122905 Hs.104835 AA470070 ESTs 2.6 

116399 Hs.110637 AA599729 Homo sapiens homeobox protein A10 (HOXA1 2.6 

130279 Hs.153934 AA424044 core-binding factor; runt domain; alpha 2.6 

130021 Hs.1435 M24470 guanosine monophosphate reductase 2.6 

100585 Hs.199160 HG2367-HT2463 Trithorax Homolog Hrx 2.6 

104965 Hs.30177 AA084104 ESTs 2.6 

117711 Hs.46485 N45201 EST 2.6 

124792 Hs.48712 R44357 ESTs 2.6 

111299 Hs.74313 N73808 ESTs 2.6 

103616 Hs.32971 Z46973 phosphoinositide-3-kinase; class 3 2.6 

133629 Hs.195614 D13642 KIAA0017 gene product 2.6 

126484 Hs.169977 AI086782 ESTs 2.6 

100858 H64245-HT4515 Forkhead Family Atx1 2.6 

133547 Hs.301927 X02883 T-cell receptor; alpha (V;D;J;C) 2.6 

126680 Hs.133865 F07097 ESTs 2.6 

125739 Hs.92137 AA428557 v-myc avian myelocytomatosis viral oncog 2.6 

102276 Hs.10247 U30999 Human (memc) mRNA, 3UTR 2.6 

105586 Hs.191538 AA279137 ESTs 2.6 

103978 Hs.34136 AA307443 ESTs 2.6 

125054 Hs.268601 T80622 ESTs; Weakly similar to (defline not ava 2.6 

114212 Hs.21201 Z39338 ESTs; Highly similar to (defline not ava 2.6 

116959 Hs.40022 H79310 EST 2.6 

109228 Hs.306995 AA193366 ESTs 2.6 

133989 Hs.78202 U29175 SWI/SNF related; matrix associated; acti 2.6 

100640 Hs.182183 HG2743-HT2845 Caldesmon 1, Alt Splice 3, Non-Muscle 2.6 

133093 Hs.285996 AA598749 ESTs 2.6 

114306 Hs.6540 Z40861 ESTs 2.6 

106060 Hs.171391 AA417287 C-terminal binding protein 2 23 

107748 Hs.60772 AA017258 EST 2.5 

100134 Hs.49 D13264 macrophage scavenger receptor 1 2.5 

133969 Hs78 U13044 GA-binding protein transcription factor; 2.5 

130992 Hs.74316 AA455001 ESTs 2.5 

127493 Hs.291701 AA808081 oc39a08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.5 

132869 Hs.203961 N26855 ESTs 2.5 

117570 Hs.44583 N34415 EST 2.5 

124644 Hs.109654 N91279 ESTs 2.5 

103558 Hs.2785 Z19574 keratin 17 2.5 

132883 Hs.5897 AA047151 ESTs 2.5 

102009 Hs.82643 U02680 protein tyrosine kinase 9 2.5 

116058 Hs.20159 AA454156 ESTs 2.5 

121989 Hs.193784 AA430044 ESTs 2.5 

131257 Hs.24908 AA256042 ESTs 2.5 

100320 Hs.75275 D50916 homolog of yeast (S. cerevisiae) ufd2 2.5 

102959 Hs.121524 X15722 glutathione reductase 2.5 

132969 Hs.6166 AA047616 ESTs 2.5 

130869 Hs.2057 AA12810O uridine monophosphate synthetase (orotat 2.5 

129645 Hs.118131 L38928 5,10-methenyltetrahydrofolate synthetase 2.5 

125399 Hs.83883 AA128075 ziidquo.i1 oOaiQS jrGgnani^ui6ru5_iNDnru 2.5 

134069 HSJ8935 U29607 Homo sapiens eIF-2-associated p67 homolo 2.5 

109816 Hs.61960 F11013 ESTs; Weakly similar to KIAA0176 [H.sapi 2.5 

134801 Hs.89695 X02160 insulin receptor 2.5 

104232 Hs.10567 AB002351 Human mRNA for KIAA0353 gene; partial cd 2.5 

107361 Hs.159486 U72513 Human RPL13-2 pseudogene mRNA; complete 2.5 

106057 Hs.289074 AM17067 ESTs 2.5 

134252 Hs.80720 AA031782 Homo sapiens mRNA; cDNA DKFZp586B1722 (f 2.5 

128062 Hs.105547 AA379500 ESTs 2.5 

110009 Hs.6614 H10933 ESTs 2.5 

111375 Hs.20432 N93696 ESTs 2.5 

122642 Hs.99361 AA454186 ESTs 2.5 

127999 Hs.69851 AA837495 ESTs; Weakly similar to Wiskott-Aldrich 2.5 

105029 Hs.13268 AA126855 ESTs 2.5 

105082 Hs^6765 AA143763 ESTs; Weakly similar to Similarity to S. 2.5 
TABLE 1A shows the accession numbers for those primekeys lacking UnigeneIDs for Table 
1. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 
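
The three-column layout described above (Pkey, CAT number, Accession) can be read mechanically. The following Python fragment is a minimal sketch only, assuming each table row has been recovered as a single whitespace-separated line; the names `ClusterEntry`, `parse_cluster_row` and `build_index` are hypothetical and are not part of the original disclosure. The two sample rows are transcribed from entries legible in this table.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ClusterEntry:
    pkey: str                    # Eos probeset identifier
    cat_number: str              # gene cluster number
    accessions: Tuple[str, ...]  # Genbank accessions comprising the cluster


def parse_cluster_row(row: str) -> ClusterEntry:
    """Split one 'Pkey CAT-number Accession...' row on whitespace."""
    pkey, cat_number, *accessions = row.split()
    return ClusterEntry(pkey, cat_number, tuple(accessions))


def build_index(rows: Iterable[str]) -> Dict[str, ClusterEntry]:
    """Index cluster entries by probeset identifier."""
    return {entry.pkey: entry for entry in map(parse_cluster_row, rows)}


# Illustrative rows only (assumed to be one row per line).
ROWS = [
    "102116 entrez_U13706 U13706",
    "102764 entrez_U82310 U82310",
]

index = build_index(ROWS)
print(index["102116"].accessions)
```

Keying the index on the Pkey reflects how Table 1 refers back to this table: a probeset without a UnigeneID is looked up here by its primekey.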



Pkey CAT number Accessions 



108552 
126023 
126086 
102565 
101964 
125499 
125596 
118417 



11 1555 J 

1596090J 

1606216J 

32479J 

48158.-7 

1562851J 

1708455J 

371 86 J 



125661 
125957 
125982 
127248 
103731 
127261 
127265 
126659 
127315 
103806 
128104 
104602 
128152 
128422 
127897 
106566 



327827J 

1583542J 

1766315.1 

227560.1 

112052.1 

231687J 

232391.1 

1541209.1 

37938.1 

112618.1 

502608J 

524482 J. 

297868.1 

1811283.1 

446527.1 

120358.1 



129735 44573.2 



123147 
130529 
123579 
109175 
100789 
100858 



AA071210 AA069899 AA071438 AA084912 AA084803 AA079371 AA079370 

H57661 H58881 

H75681 H70975 

AB010994 U59748 AA064660 

S81578 

H10543R11878 
R25698 R56582 R56018 

AF0B0229 AF080231 AF080230 AF080232 AF08Q233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
A1633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 A1672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW0 13827 AA248638 AI214968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI551023 AI867418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 A1652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H690O1 U87594 BE466420 AI624817 
BE46661 1 AI206344 AA574397 AA348354 AI493192 

AA491 830 R50173 R55192 R50320 AI732306 AI732305 AI820727 A1820728 R55191 R50319 R50227 
H41694H45213 



2198Q2.-2 
158447.1 

genbanK_AA608983 
genbanK-AA180496 
tigr.HT4163 
tigr_KT4515 



AA364195 AA325029 AW962050 

AA070545 AA131490 AA131373 

AA330501 AA661567 

AA331503 AA332751 AW962542 

T16245 R19694 F13545 H10299 T66048 T65279 H18006 

AF1 16622 All 14507 AA640834 AA377999 

AA130614AA071410 

AA906093AA971000 

K47610R66920 

F07973R20353AA442660 

T77794T85681 

AA773681 AA773857 

BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 
AI369742 AI039658 AI885095 AI476470 AI287650 AI885299 A1985381 AW592624 AW340136 AI266556 AA456390 
A1310815AA484951 

AI950087 N70208 R97040 N36809 AI3081 19 AW967677 N35320 AG51473 H59397 AW971573 R97278 W01059 
AW967671 AA908598 AA251875 AI820501 AI820532 W87891 T85904 U71456 T82391 BE328571 T75102 R34725 
AA884922 BE328517 AI219788 AA884444 N92578 F13493 AA927794 AI560251 AW874068 AL134043 AW235363 
AA663345 AW008282 AA488964 AA283144 AI890387 AI950344 AI741346 AI689062 AA282915 AW102898 AI872193 
AI763273 AW173586 AW150329 AI653832 AI762688 AA988777 AA488892 AI356394 AW103813 AI539642 AA642789 
AA856975 AW505512 A1961530 AW629970 BE612881 AW276997 AW513601 AW512843 AA044209 AW856538 
AA180009 AA337499 AW961 101 AA251669 AA251874 AI819225 AW205862 AI683338 AI858509 AW276905 AI633006 
AA972584 AA908741 AW072629 AW513996 AA293273 AA969759 N75628 N22388 H84729 H60052 T92487 A1022058 
AA780419 AA551005 W80701 AW613456 AO73032 AI564269 F00531 H83488 W37181 W78802 R66056 A1002839 
R67840 AA300207 A W9 59581 T63226 F04005 
AA487961 

AA178953AA192740 

AA608983 

AA180496 

S67998 

U10072 

123798 579959_1 AA620411 AA287491 
102116 entrez_U13706 U13706 
102398 entrez_U42359 U42359 
102764 entrez_U82310 U82310 
118475 genbank_N66845 N66845 
104776 genbank_AA026349 AA026349 
104787 genbank_AA027317 AA027317 
113702 genbank_T97307 T97307 
113938 genbank_W81598 W81598 
122635 genbank_AA454085 AA454085 
108407 genbank_AA075519 AA075519 
108432 genbank_AA076626 AA076626 
108555 genbank_AA084963 AA084953 
101349 entrez_L77559 L77559 
124447 genbank_N48000 N48000 
119071 genbank_R31180 R31180 
103520 entrez_Y10511 Y10511 
103663 genbank_Z78291 Z78291 
128046 877605_1 AA8732B5 AI025762 
126959 546044_1 AA199853 AA206355 
123465 genbank_AA599033 AA599033 
TABLE 2 shows a preferred subset of the accession numbers for genes found in Table 1 
that are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue (relaxed ratio (87/70)) 
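
The column legend above implies a simple record structure for each row. As a hedged illustration, the Python sketch below parses rows of the form "Pkey ExAccn UnigeneID Title R1" and ranks them by R1. It assumes one row per line, tolerates a missing UnigeneID (as for the primekeys listed in Table 2A), and assumes the last numeric token is R1; `ExpressionRow`, `ROW_RE` and `parse_row` are hypothetical names, not part of the original disclosure. The sample rows are transcribed from legible entries of this table.

```python
import re
from typing import NamedTuple, Optional


class ExpressionRow(NamedTuple):
    pkey: str
    ex_accn: str
    unigene_id: Optional[str]  # None when the row carries no UnigeneID
    title: str
    r1: float                  # ratio of tumor to normal body tissue


# Assumed layout: Pkey, ExAccn, optional "Hs.<digits>", title, trailing ratio.
ROW_RE = re.compile(
    r"^(?P<pkey>\d+)\s+(?P<accn>\S+)\s+"
    r"(?:(?P<uid>Hs\.\d+)\s+)?"
    r"(?P<title>.+?)\s+(?P<r1>\d+(?:\.\d+)?)$"
)


def parse_row(line: str) -> Optional[ExpressionRow]:
    """Return a parsed row, or None if the line does not fit the layout."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    return ExpressionRow(m["pkey"], m["accn"], m["uid"], m["title"], float(m["r1"]))


SAMPLE = [
    "109795 F10707 Hs.326416 ESTs 6.7",
    "113938 W81598 ESTs 5.4",
    "102913 X07696 Hs.80342 keratin 15 5",
]

rows = [r for r in map(parse_row, SAMPLE) if r is not None]
ranked = sorted(rows, key=lambda r: r.r1, reverse=True)
for r in ranked:
    print(r.pkey, r.title, r.r1)
```

Because titles can themselves end in a number (for example "keratin 15"), the pattern treats only the final numeric token as the ratio; rows damaged by extraction will simply fail to match and are skipped rather than guessed at.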



Pkey ExAccn UnigeneID Unigene Title R1 



131919 
120328 
101486 
119073 
133428 
128180 
104080 
127537 
131665 
101050 
130771 
107485 
106155 
129534 
100569 



AA121266 

M196979 

M24902 

R32894 

M34376 

AA595348 

AA402971 

AA569531 

R22139 

K01911 

N48056 

W63793 

AA425309 

R73640 



Hs.272458 

Hs.290905 

Hs.1852 

Hs.279477 

Hs.183752 

Hs.171995 

Hs.57771 

Hs.162859 

Hs.30343 

Hs.1832 

Hs.1915 

Hs.262476 

Hs.33287 

Hs.11260 



ESTs 

ESTs; Weakly similar to (defline not ava 

acid phosphatase; prostate 

ESTs 

microseminoprotein; beta 

kallikrein 3; (prostate specific antigen 

Homo sapiens mRNA for serine protease (T 

ESTs 

ESTs 

neuropeptide Y 

folate hydrolase (prostate-specific memb 
S-adenosylmethionine decarboxylase 1 
ESTs 
ESTs 



135389 
133944 
130974 
114768 
104660 
131061 
126645 
135153 
107033 
118417 
126758 
107102 
116787 
115719 
123209 
101664 
112971 
117984 
129523 
132964 
121853 
119617 
105627 
101461 
124526 
133845 
133354 
119018 
100394 
106579 
114965 
112033 
102398 
101201 
101803 
120562 



HG2261-HT2351 

Hs.181350 
Hs.99872 
Hs.7780 
Hsi178 
Hs.182339 
Hs.14846 
HSL268744 
Hs.61635 
Hs.95420 
Hs.113314 



U05237 
AA045870 
X57985 
AA145007 
AA007160 
N64328 
AI167942 
N40141 
AA599629 
N66048 
W37145 
AA609723 
H28581 
AA416997 
AA489711 
M60752 
T17185 
N51919 
M30894 
AA031360 
AA425887 
W47380 
AA281245 
M22430 



T68510 

AA055552 

N95796 

D84276 

AA456135 

AA250737 

R43162 

U42359 

L22524 

M86546 

AA280036 



kallikrein 2; prostatic 
fetal Alzheimer antigen 
ESTs 

H2B histone family, member Q 
ESTs 
ESTs 

ESTs; Moderately similar to KIAA0273 [H. 
Homo sapiens BAC clone RG041D11 from 7q2 10.7 
Homo sapiens mRNA for JM27 protein; comp 10.6 



37.2 

32.6 

25.2 

24.8 

23.8 

21.4 

18.9 

18.6 

17.4 

17.3 

17 

16.7 

16.5 

16.4 

Antigen, Prostate Specific, Alt Splice 16 
15.4 
15 
12.5 
11.8 
11.8 
11.4 
10.9 



H&293960 

Hs.30652 

Hs.15641 

Hs59622 

Hs^03270 

Hs.121017 

Hs.83883 

Hs.106778 

HS274509 

Hs.167133 

Hs.98502 

Hs.55999 

Hs.23317 

Hs.76422 

Hs.293185 

Hs.76704 

Hs.334762 

Hs.278695 

Hs.66052 

Hs.23023 

Hs.72472 

Hs^2627 

Hs.2256 

Hs.155691 

Hs.302267 



ESTs 

ESTs; Weakly similar to polymerase [H.sa 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

H2A histone family; member A 

ESTs 

ESTs 

T-cell receptor, gamma cluster 

ESTs 

ESTs 

ESTs 

ESTs 

phospholipase A2; group IIA (platelets; 
yz61c5.s1 Soares_multiple_sclerosis_2NbH 
ESTs 

ESTs; Weakly similar to KIAA0319 [H.sapi 
ESTs 

CD38 antigen (p45) 

ESTs 

ESTs 

ESTs 

Human N33 protein form 1 (N33) gene, exo 
matrix metalloproteinase 7 (matrilysin; 



ESTs; Weakly similar to W01A6.C [C.elega 
10.6 
10.5 
10.2 
10.1 
10.1 
10 
9.9 
9.8 
9.7 
9.7 
9.4 
9.2 
9 

8.9 

8.8 

8.7 

8.5 

8.2 

8.1 

8 

8 

7.6 
7.4 
7.1 
7 

6.9 
6.3 
6.8 
109112 AA169379 Hs.257924 ESTs 6.8 

109795 F10707 Hs.326416 ESTs 6.7 

130336 X07730 Hs.171995 kallikrein 3; (prostate specific antigen 6.6 

131425 AA219134 H$26691 ESTs 6.6 

132802 AA490969 H&59838 ESTs 6.6 

133724 U07919 Hs.75746 aldehyde dehydrogenase 6 6.5 

120215 Z41050 Hs.108787 Homo sapiens Mcd4p homolog mRNA; complet 6.5 

131681 AA010163 Hs.3383 upstream regulatory element binding prot 6.5 

100727 X07290 Hs.334786 Human HF.12 gene mRNA 6.3 

121770 AA421714 Hs.278428 Homo sapiens mRNA for KIAA0896 protein; 6.3 

123475 AA599267 Hs25G528 ESTs; Weakly similar to ANKYRIN; BRAIN V 6.3 

133061 AB000584 Hs.296638 prostate differentiation factor 6.3 

116429 AA609710 Hs.279923 ESTs; Weakly similar to similar to GTP-b 6.2 

101233 129003 Hs.878 sorbitol dehydrogenase 6.2 

104691 AA011176 Hs.37744 ESTs 6.2 

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2 

105500 AA256485 Hs.222399 ESTs 6.1 

130828 AA053400 Hs.203213 ESTs 5.9 

115357 AA281793 Hs.72988 ESTs 5.8 

116334 AA491457 Hs.48948 ESTs 5.7 

120132 Z38839 Hs.125019 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6 

106375 AA443993 Hs.289072 ESTs 5.6 

124777 R41933 Hs.140237 ESTs; Weakly similar to neuronal thread 5.6 

101791 M83822 Hs.62354 Human beige-like protein (BGL) mRNA; par 5.5 

117698 N41002 Hs.45107 ESTs 5.5 

122041 AA431407 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 5.5 

133723 AA088851 Hs.262476 S-adenosylmethionine decarboxylase 1 5.5 

113938 W81598 ESTs 5.4 

133015 AA047036 Hs.246315 ESTs 5.4 

108186 AA056482 Hs.7780 ESTs 5.3 

104466 N25110 Hs.326392 Human guanine nucleotide exchange factor 5.3 

104033 AA365031 Hs.98944 ESTs 5.3 

110844 N31952 Hs.167531 ESTs; Weakly similar to (defline not ava 5.3 

129058 H70627 Hs.108336 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3 

133493 AA284143 Hs.194369 Homo sapiens chromosome 1 atrophin-1 rel 5.3 

129184 W26769 Hs.109201 ESTs; Highly similar to (defline not ava 5.2 

101448 M21389 Hs.195850 keratin 5 (epidermolysis bullosa simplex 5.1 

116188 AA464728 Hs.184598 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1 

105921 AA402613 Hs.169119 ESTs 5.1 

103375 X91868 Hs.54416 sine oculis homeobox (Drosophila) homolo 5.1 

128871 AA400271 Hs.106778 ESTs; Highly similar to (defline not ava 5.1 

116238 AA479362 Hs.47144 ESTs 5 

102913 X07696 Hs.80342 keratin 15 5 

103011 X52541 Hs.326035 early growth response 1 5 

118981 N93839 Hs.39288 ESTs; Weakly similar to !!!! ALU SUBFAMI 5 

TABLE 2A shows the accession numbers for those primekeys lacking UnigeneIDs for Table 
2. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 



Pkey CAT number Accession 

118417 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248838 AI214968 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI857418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 
BE466611 AI206344 AA574397 AA348354 AI493192 

127248 227560_1 AA364195 AA325029 AW962050 

107033 235652_1 AI141999 AA730176 R44544 R41778 AW300793 AW966157 AA918501 AA599529 AI082195 AI19B537 AW006520 
AW236663 AW151420 AI826987 AI810832 AI569102 AC01981 N27331 AA335566 T84622 BE085347 BE085269 

102398 entrez_U42359 U42359 
113938 genbank_W81598 W81598 

TABLE 3 shows genes, including expressed sequence tags, differentially expressed in 
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu02 
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor 
samples and various normal tissue samples showing the highest expression of the gene. 






Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue 



Pkey ExAccn UnigeneID Unigene Title 



100131 
100235 
100570 
100819 
101063 
101247 
101416 
101447 
101485 
101514 
101626 
101663 
101758 
101768 
101817 
101888 
102031 
102052 
102221 
102233 
102302 
102348 
102457 
102473 
102669 
102698 
102751 
102823 
102869 
103031 
103043 
103093 
103376 
103401 
103613 
103677 
103962 
104084 
104257 
104301 
104769 
104851 
104896 
104956 
104957 
104967 
105099 
105298 



D12485 Hs.11951 
D29954 Hs.13421 
HG2261-HT2352 
HG4020-HT4290 
L00354 Hs.80247 



phosphodiesterase I/nucleotide pyrophosp 

KIAA0056 protein 

Hs.171995 

Hs.2387 



L33801 

M17254 

M21305 

M24736 

M28214 

M57399 

M60750 

M77836 

M81118 

M88163 

M99701 

U04898 

U07559 

U24576 

U26173 

U33052 

U37519 

U48807 

U49957 

U71207 

U75272 

U80034 

U90914 

X02544 

X54667 

X55733 

X60708 

X92098 

X95240 

Z46629 

Z83806 

AA298180 

AA410529 

AP006265 

D45332 

AA025887 

AA040882 

AA054228 

AA074880 

AA074919 

AA084506 

AA150776 

AA233459 



Hs.78802 
Hs.279477 

Hs.89546 

Hs.123072 

Hs.44 

Hs2178 

Hs.79217 

Hs.78989 

Hs.152292 

Hs.95243 

Hs.2156 

Hs505 

Hs.3844 

Hs.79334 

Hs.69171 

Hs.87539 

Hs2359 

Hs.180398 

Hs.29279 

Hs.1867 

Hs.68583 

Hs.5057 

Hs.572 

Hs.123114 

Hs.93379 

Hs.44926 

Hs.323378 

Hs.54431 

Hs2316 

Hs.83243 

Hs.30732 

Hs.9222 

Hs.6783 

HS293943 

Hs.10290 

Hs.23165 

Hs.20509 

Hs.10026 

Hs.291000 

HS23729 

HS26369 



glycogen synthase kinase 3 beta 
v-ets avian erythroblastosis virus E26 o 
Human alpha satellite and satellite 3 ju 
selectin E (endothelial adhesion molecul 
RAB3B; member RAS oncogene family 
pleiotrophin (heparin binding growth fac 
H2B histone family; member A 
pyrroline-5-carboxylate reductase 1 

SWI/SNF related; matrix associated; acti 

transcription elongation factor A (SH)- 

RAR-related orphan receptor A 

ISL1 transcription factor; LIM/homeodoma 

LIM domain only 4 

nuclear factor; interleukin 3 regulated 

protein kinase C-like 2 

aldehyde dehydrogenase 8 

dual specificity phosphatase 4 

LIM domain-containing preferred transloc 

eyes absent (Drosophila) homolog 2 

progastricsin (pepsinogen C) 

mitochondrial intermediate peptidase 

carboxypeptidase D 

orosomucoid 1 

cystatin S 

eukaryotic translation initiation factor 
dipeptidylpeptidase IV (CD26; adenosine 
coated vesicle membrane protein 
specific granule protein (28 kDa); cyste 
SRY (sex-determining region Y)-box 9 (ca 
H.sapiens mRNA for axonemal dynein heavy 
ESTs 
ESTs 

estrogen receptor-binding fragment-assoc 
ESTs 

ESTs; Weakly similar to !!!! ALU SUBFAMI 
U5 snRNP-specific 40 kDa protein (hPrp8- 
ESTs 

ESTs; Weakly similar to hypothetical pro 
ESTs; Weakly similar to ORF YJL063C (S.c 
ESTs 

Homo sapiens clone 24405 mRNA sequence 
ESTs 




R1 

6.3 
5.1 

Antigen, Prostate Specific, Alt Splice 
Transglutaminase 105 

as 

4.7 

4.7 

11 

9.8 

6.2 

8.4 

4.9 

5.4 

7S 

5.5 

57 

132 

8.9 

5.6 

7.4 

8.2 

5.9 

5.1 

5.7 

9 

10.6 

15.6 

45 

22.6 

4.7 

4.9 

5.8 

52 

7.4 

52 

4.9 

6 

6.4 

6JB 

105 

6.3 

4.9 

5.8 

6.4 

4.8 

65 

7 

5.1 
105304 AA233553 Hs.190325 ESTs 4.7 

105370 AA238476 H 522791 ESTs; Weakly similar to transmembrane pr 103 

105427 AA251330 H&28248 ESTs 5 

105542 AA261858 Hs.266957 ESTs; Weakly similar to heat shock prate 8.8 

105628 AA281251 Hs.79828 ESTs; Weakly similar to putative zinc fi 5.5 

105640 AA281623 Hs.6685 ESTs; Weakly similar to KIAA0742 protein 8 

105645 AA282138 Hs.11325 ESTs 14 

105691 AA287097 Hs.289068 transcription factor 4 6.3 

105730 AA292701 Hs.5364 OKFZP5641052 protein 4.9 

105608 AA393808 Hs.286131 KIAA0438 gene product 7 

105826 AA398243 Hs.194477 ESTs; Moderately similar to similar to N 5 

105903 AA401433 Hs.200016 ESTs; Weakly similar to diphosphoinosito 9.9 

105906 AA401633 Hs.22380 ESTs 115 

106065 AA417558 Hs.25206 ESTs 5.1 

106094 AA419461 Hs.23317 ESTs 10.9 

106157 AA425367 Hs.34892 ESTs 6.6 

106184 AA426643 Hs.10762 ESTs 85 

106211 AA428240 Hs.126083 ESTs 8.4 

106213 AA428258 Hs.8769 Homo sapiens mRNA; cDNA DKFZp564E153 (fr 5.7 

106272 AA432074 Hs.323099 ESTs 5.8 

106369 AA44382B Hs.288856 ESTs 6.3 

106400 AA447621 Hs.94109 ESTs 5.4 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 9.2 

106507 AA452584 Hs.267819 protein phosphatase 1; regulatory (inhib 5.6 

106523 AA453441 Hs.31511 ESTs 4.7 

106532 AA453628 Hs.37443 ESTs 4.7 

106557 AA455087 Hs.22247 ESTs 5.7 

106575 AA456039 Hs.105421 ESTs 72 

106618 AA459249 Hs.8715 ESTs; Weakly similar to Similarity with 5.6 

106820 AA481037 Hs.12592 ESTs 5.4 

106846 AA485223 Hs.34892 ESTs 5.3 

106973 AA505141 Hs.11923 Human DNA sequence from clone 167A19 on 7.5 

107110 AA609952 Hs.12784 KIAA0293 protein 6.1 

107127 AA620504 Hs.179898 ESTs 7.1 

107159 AA621340 Hs.10600 ESTs; Weakly similar to ORF YKR081C [S.c 5.2 

107217 D51095 Hs.35861 DKFZP586E1621 protein 15.1 

107365 U78294 Hs.111256 arachidonate 15-lipoxygenase; second typ 4.7 

107630 AA007218 Hs.60178 ESTs 5.3 

107734 AA016225 Hs.7517 ESTs 4.8 

107760 AA018042 Hs£52085 EST 7.6 

107997 AA037388 Hs32223 Human DNA sequence from clone 1 41 H5 one 105 

108012 AA039616 Hs.173334 ESTs 65 

108520 AA084138 Hs.46786 ESTs 7.9 

108583 AA088276 Hs.68826 ESTs 5.6 

108613 AA100967 Hs.69165 ESTs 6

108664 AA113349 Hs.69588 EST 63 

108677 AA115629 Hs.118531 ESTs 5.9

108807 AA129966 Hs.49376 ESTs; Weakly similar to PROTEIN PHOSPHAT 5.8

108910 AA136590 ESTs 5

108933 AA147224 Hs337232 ESTs 12.7

108948 AA149579 Hs.118258 ESTs 6.8 

109014 AA156790 Hs.262036 ESTs 15.3 

109124 AA171529 Hs.183887 ESTs 6.1

109142 AA176438 Hs.41295 ESTs 5.1

109277 AA196332 Hs.86043 ESTs 5.5

109342 AA213620 Homo sapiens mRNA; cDNA DKFZp586M1418 (f 6

109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 10.8

109565 F01930 Hs.23648 ESTs 7 

109648 F04600 Hs.7154 ESTs 9.9

109799 F10770 Hs.180378 Homo sapiens clone 669 unknown mRNA; com 6.4

109859 H02308 Hs.20792 ESTs 53 

110181 H20276 Hs.31742 ESTs 16.8 

110854 N32919 Hs.27931 ESTs 10 

110924 N47938 Hs.12940 yy84a09.s1 Soares_multiple_sclerosis_2Nb 5.6

111046 N55514 Hs3 18584 ESTs 6.9

111091 N59858 Hs33032 Homo sapiens mRNA; cDNA DKFZp434N185 (fr 52

111157 N66613 Hs.99364 ESTs 5

111164 N66857 Hs.122489 ESTs; Weakly similar to !!!! ALU CLASS C 5.6

111221 N68869 Hs.15119 ESTs 52

111348 N90041 Hs.9585 ESTs 5.4

111353 N90430 Hs.6616 ESTs 5.3 

111495 R07210 Hs.9883 ESTs 5.8 

111540 R08850 Hs3786 ESTs 6 

111579 R10657 Hs.167115 KIAA0830 protein 12.6

111581 R10684 Hs3794 ESTs 7.1 

111734 R25375 Hs.128749 ESTs 63 

111861 R37460 Hs35231 ESTs 9.4 

111870 R37778 Hs.18685 ESTs; Weakly similar to hypothetical pro 65 

111937 R40431 Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 4.8

111987 R42036 Hs.6763 KIAA0942 protein 6.4

112184 R49173 Hs330242 ESTs 5.6 

112286 R53765 Hs.158135 KIAA0981 protein 93 

112380 R59740 Hs.5740 ESTs 4.7 

112452 R63841 Hs.157461 ESTs 6

112601 R79111 Hs.78225 annexin A1 5.4

112753 R93696 Hs.169882 ESTs 53 

112902 T09262 Hs.129190 ESTs 5.1 

112984 T23457 Hs389014 ESTs 43 

113021 T23855 Hs.129836 KIAA1028 protein 10.8

113083 T40530 Hs366957 ESTs; Weakly similar to heat shock prote 5.7 

113200 T57773 Hs.10263 ESTs 7.3 

113494 T88878 Hs.86538 ESTs 8.7 

113849 W60439 Hs.8858 ESTs; Moderately similar to cbp146 [M.mu 43

113883 W72382 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro 4.7

113950 W85765 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 6.7 

113986 W87462 Hs31894 ESTs 5.9 

113989 W87544 Hs368828 ESTs 4.7 

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213

114340 Z41395 Hs.143611 ESTs 9.6

114346 Z41450 Hs.130489 ESTs 5.2 

114435 AA018216 Hs.164975 Bicaudal D (Drosophila) homolog 1 7.4

114463 AA025370 Hs.40109 KIAA0872 protein 83 

114652 AA101416 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S 5.4

114721 AA131450 Hs.103822 ESTs 43

114730 AA133527 Hs.331328 ESTs; Weakly similar to The KIAA0138 gen 5.1

114833 AA234362 Hs.87159 ESTs; Moderately similar to CGI-66 prote 5.5 

114860 AA235112 Hs.42179 ESTs; Moderately similar to similar to m 63 

114884 AA235811 Hs393672 ESTs 5.2 

114895 AA236177 Hs.76591 KIAA0887 protein 4.7

114908 AA236545 Hs.54973 ESTs 53 

114932 AA242751 Hs.16218 KIAA0903 protein 5.7 

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 53 

115140 AA258030 Hs379938 ESTs; Weakly similar to supported by GEN 5.9

115468 AA287061 Hs.48499 ESTs; Highly similar to Bdeight protein 4.7

115583 AA398913 Hs.45231 LTOC1 protein 7.6 

115709 AA412519 Hs38279 ESTs 4.8 

115772 AA423972 Hs.131740 ESTs 5 

115774 AA424029 Hs388390 ESTs; Moderately similar to dynamin; int 5.4

115776 AA424038 Hs.81897 ESTs 5

115821 AA427528 Hs.130965 ESTs; Weakly similar to ZINC FINGER PROT 13.7 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q3 10.6

116024 AA451748 Hs33883 Human DNA sequence from clone 718J7 on c 63

116108 AA457566 Hs38777 ESTs 6 

116117 AA459117 Hs.31575 SEC63; endoplasmic reticulum translocon 7.3

116146 AA460701 Hs.15423 ESTs 53 

116296 AA489033 Hs.62601 Homo sapiens mRNA; cDNA DKFZp586K1318 (f 5.7 

116379 AA521472 Hs.71252 ESTs 5.9 

116393 AA599463 Hs.306051 protein phosphatase 2 (formerly 2A); reg 5.9

116401 AA5999S3 Hs.59698 ESTs 7.9

116416 AA609219 Hs.39982 ESTs 93 

116587 D59325 Hs.121429 ESTs 53 

116601 D80055 Hs.45140 ESTs 43 

116684 F09156 Hs.66095 ESTs 73 

116722 F13654 HSRH32 Stratagene cat#9372 12 (1992) Horn 5.5

116766 H13260 Hs.95097 ESTs 5.9 

117453 N29568 Hs.108319 thyroid hormone receptor-associated prot 63 

117557 N33920 Hs.44532 diubiquitin 43

117708 N45114 Hs.126280 ESTs 6.3

118001 N52151 Hs.47447 ESTs 11.4

118229 N62339 Hs.166254 heat shock 90kD protein 1; alpha 62 

118599 N69207 Hs203697 ESTs 5.8 

118645 N70358 Hs.125180 growth hormone receptor 7.1

118873 N89881 Hs.44577 ESTs 6

118985 N94303 Hs.55028 ESTs 9.3 

119107 R42424 Hs.63841 ESTs 6 

119126 R45175 Hs.117183 ESTs 17.9

119271 T16387 Hs.65328 ESTs 6

119367 T78324 Hs.250895 ESTs 5

119721 W69440 Hs.48376 ESTs 15.4

119741 W70205 Hs.43670 kinesin family member 3A 10.1

119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical pro 5.3 

120217 Z41078 Hs.66035 ESTs 4.8 

120266 AA173939 Hs205442 ESTs; Weakly similar to inner centromere 8.8

120294 AA190888 Hs.153881 ESTs; Highly similar to NY-REN-62 antige 4.9

120418 AA236010 Hs26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (f 4.7

120486 AA253400 Hs.137569 tumor protein 63 kDa with strong homolog 5.6

120524 AA261852 Hs.192905 ESTs 4.9

120571 AA280738 Hs.34892 ESTs 8.8

120596 AA282074 Hs237323 ESTs 62 

120713 AA292655 Hs35557 ESTs 9.9 

120992 AA398246 Hs.97594 ESTs 16.4 

121429 AA406293 Hs.41167 ESTs 6.9 

121503 AA412049 Hs290347 ESTs 7.6

121512 AA412105 Hs.193736 ESTs 5.8 

121816 AA424814 Hs.48827 ESTs 4.6 

122027 AA431302 Hs.98721 EST; Weakly similar to N-copine [H.sapie 5.6

122294 AA437311 Hs.98927 ESTs 5.7

122411 AA446859 Hs.99083 ESTs 6.5

122791 AA460158 Hs.129836 KIAA1028 protein 12.4

122792 AA460225 Hs.99519 ESTs 5.1 
122969 AA478539 Hs.104336 ESTs 4.9 
123095 AA485724 Hs27413 ESTs 5.4 

123100 AA485957 Hs.306219 Homo sapiens clone 25032 mRNA sequence 5

123295 AA495981 Hs250830 ESTs 4.7 

123311 AA496252 Hs.105069 ESTs 7.4

123583 AA609006 Hs.111240 ESTs 9.1 

123619 AA609200 ESTs 4.7 

123645 AA609310 Hs.188691 ESTs 4.8

123709 AA609651 Hs.112742 ESTs 7

123968 C14333 Hs.108327 damage-specific DNA binding protein 1 (1 5

124178 H45996 Hs.97101 putative G protein-coupled receptor 6.8

124352 N21626 Hs.102406 ESTs 102

124357 N22401 yw37g07.s1 Morton Fetal Cochlea Homo sap 10.6

124515 N58172 Hs.109370 ESTs 142 

124911 R88992 Hs.174195 ESTs 4.8 

125154 W38419 ESTs 4.7 

125992 W01626 za36e07.r1 Soares fetal liver spleen 1NF 5.1 

126802 AA947601 Hs.97056 ESTs 5.1

126812 Z36290 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1 4.6 

127080 AA662913 Hs.190173 ESTs 5 

127308 AA507628 Hs.334390 ESTs 4.8

127370 AI024352 Hs.70337 immunoglobulin superfamily; member 4 4.7

127386 AI457411 Hs.106728 ESTs 4.8

127965 AA828760 Hs292059 ESTs 4.8 

128172 AI400862 Hs265130 ESTs 5 

128305 AI039722 Hs279009 ESTs 5.8 

128420 AI088155 Hs.41296 ESTs; Weakly similar to unknown [H.sapie 17

128467 AA176446 Hs.180428 ESTs; Weakly similar to hypothetical 43. 4.8

128610 L38608 Hs.10247 activated leucocyte cell adhesion molecu 75

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapi 8.1

128651 AA446990 Hs.103135 ESTs 6.5 

129088 AA215971 Hs.194431 KIAA0992 protein 52 

129136 N26391 Hs250723 ESTs 5.1

129171 AA234048 Hs.7753 calumenin 5.8

129229 AA211941 Hs.109643 polyadenylate binding protein-interactin 5.8

129386 N27524 Hs260024 Cdc42 effector protein 3 52 

129467 AA410311 Hs.44208 ESTs 5.1

129564 H22136 Hs.75295 guanylate cyclase 1; soluble; alpha 3 163

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiqu 95

129821 F11019 Hs.12696 cortactin SH3 domain-binding protein 8.6

129823 X00948 Hs.105314 relaxin 2 (H2) 9.1

129847 W46767 Hs596178 ESTs; Weakly similar to RNA POLYMERASE I 5.4

129912 AA047344 Hs.107213 ESTs; Highly similar to NY-REN-6 antigen 65

129958 150591 Hs.1378 annexin A3 5.1

129977 J04076 Hs.1395 early growth response 2 (Krox-20 (Drosop 8.6

130061 U82256 Hs.172851 arginase; type II 7.4

130241 U78313 Hs.153203 MyoD family inhibitor 4.9

130466 N21679 Hs.180059 ESTs 5.8 

130541 X05608 Hs511584 neurofilament; light polypeptide (68kD) 6.7 

130619 AA477739 Hs.12532 ESTs 6.4 

130925 N71935 Hs.169378 multiple PDZ domain protein 7.9 

130938 AA013250 Hs51398 ESTs; Moderately similar to PUTATIVE GLU 6.2

130971 H20332 Hs.301444 signal sequence receptor, gamma (translo 6.4 

131066 F09006 Hs52588 ESTs 5 

131126 F09012 Hs.181326 myotubularin related protein 2 6.4 

131310 J02960 Hs5551 adrenergic; beta-2-; receptor, surface 7.9 

131487 AA253220 Hs57373 Homo sapiens mRNA; cDNA DKFZp56401763 (f 5.9

131561 X59841 Hs594101 pre-B-cell leukemia transcription factor 7.6

131562 U90551 Hs58777 H2A histone family; member U 5.1
131579 N62922 HS59088 ESTs 11 
131629 AA442119 Hs5388Q9 ESTs 4.9 

131682 AA428368 Hs.30654 ESTs 4.8

131699 R68657 Hs.90421 ESTs; Moderately similar to !!!! ALU SUB 6.5

131795 N32724 Hs.32317 Sox-like transcriptional factor 5.6

132053 H93381 Hs.38085 ESTs; Weakly similar to putative glycine 73

132122 U65092 Hs.40403 Cbp/p300-interacting transactivator; wit 5.6

132191 AA449431 Hs588361 KIAA0741 gene product 8

132256 AA608856 Hs.431 murine leukemia viral (bmi-1) oncogene h 55 

132482 AA429478 Hs538126 ESTs; Highly similar to CGI-49 protein { 6.6 

132533 AA021608 Hs.172510 ESTs 5.8 

132572 AA448297 Hs337825 signal recognition particle 72kD 65 

132581 R42266 Hs52256 ESTs; Weakly similar to beta-TrCP protei 16

132700 N47109 Hs5521 ESTs 6.8 

132701 AA279359 Hs55220 BCL2-associated athanogene 2 5.3 
132725 L41887 Hs.184167 splicing factor; arginine/serine-rich 7 7.8 
132783 N74897 Hs578894 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 5.9 

132790 X75535 Hs.168670 peroxisomal farnesylated protein 8

132939 U76189 Hs.61152 exostoses (multiple)-like 2 55 

133142 F03321 Hs.65874 ESTs 55 

133342 U29589 Hs.7138 cholinergic receptor; muscarinic 3 103 

133434 AA278852 Hs.30212 ESTs 5.8 

133453 M68941 Hs.73826 protein tyrosine phosphatase; non-recept 4.9

133520 X74331 Hs.74519 primase; polypeptide 2A (58kD) 13.1 

133544 T33873 Hs.74624 protein tyrosine phosphatase; receptor I 4.6 

133608 D13315 Hs.75207 glyoxalase I 4.8

133626 H75939 Hs.75277 Homo sapiens mRNA; cDNA DKFZp586M141 (fr 5

133633 D21262 Hs.75337 nucleolar phosphoprotein p130 63

133797 S66431 Hs.76272 retinoblastoma-binding protein 2 6

133928 N34096 Hs.7766 ubiquitin-conjugating enzyme E2E 1 (homo 5.4

134095 U47414 Hs.79069 cyclin G2 55

134249 N89827 Hs.80667 RALBP1 associated Eps domain containing 65 

134321 AA418230 Hs.8172 ESTs 7

134453 X70683 Hs.83484 SRY (sex determining region Y)-box 4 4.7

134542 X57025 Hs.85112 Insulin-like growth factor 1 (somatomedi 7.7

134570 U66615 Hs.172280 SWI/SNF related; matrix associated; acti 6.4

134592 U82613 Hs589104 Alu-binding protein with zinc finger dom 5.4

134654 W23625 Hs.8739 ESTs; Weakly similar to ORF YGR200C (S.c 5

134666 AA482319 Hs.8752 putative type II membrane protein 5.4 

134806 Z49099 Hs39718 spermine synthase 67 

134951 AA431480 Hs.169358 ESTs 9.8 

135066 X04602 Hs.93913 Interleukin 6 (interferon; beta 2) 5.7

135155 AA358268 Hs.166556 ESTs; Moderately similar to transcriptio 4.9

135411 L10333 Hs39947 reticulon 1 53

300023 M10098 AFFX control: 18S ribosomal RNA 4.6

300254 AW079607 Hs55610 ESTs; Weakly similar to ZnT-3 [H.sapiens 7.8

300273 AW013907 Hs.167531 ESTs; Moderately similar to predicted us 115

300319 AW157646 Hs.153506 ESTs; Weakly similar to microtubule-acti 8.5

300566 H86709 Hs.326392 son of sevenless (Drosophila) homolog 1 5.8

300578 AI989417 Hs.134289 ESTs 4.4 

300671 AI239706 Hs.93810 ESTs 7.9 

300675 AA039352 Hs.125034 ESTs; Weakly similar to ORF YDUMOc [S.c 43

300680 AW468066 Hs£4817 ESTs; Weakly similar to KIAA0986 protein 5.2 

300762 AI497778 Hs.20509 ESTs 6.4 

300810 AI076890 Hs.146847 ESTs 5.8 

300813 AA406411 Hs.208341 ESTs; Weakly similar to KIAA0989 protein 10.6 

300823 AI863068 Hs.106823 ESTs; Weakly similar to putative zinc fi 5.6

300834 AF109300 Hs.147924 ESTs 6.7 

300923 AW136372 Hs.1852 ESTs 7.6 

300962 AA593373 Hs293744 ESTs 5.5 

301015 AA947682 Hs.20252 ESTs; Weakly similar to Chain A; Cdc42hs 7 

301042 AI659131 Hs.197733 ESTs 24.9

301242 AW161535 Hs.23762 ESTs 113 

301254 AI049624 Hs£83390 EST cluster (not in UniGene) with exon h 4.3 

301262 H29500 Hs.7130 ESTs; Moderately similar to N-copine [H. 4.3

301388 AA156879 Hs.262036 ESTs; Weakly similar to ZINC FINGER PROT 6.6

301563 AI802946 Hs.44208 ESTs; Weakly similar to match to ESTs AA 5.7

301656 AW008475 Hs.151258 EST cluster (not in UniGene) with exon h 6.8

301689 Z44810 Hs.301789 ESTs; Weakly similar to similar to Cele 6.3

301783 AL046347 Hs.83937 Homo sapiens PAC clone DJ1159004 from 7p 6.2

301805 AI800004 Hs.142846 ESTs; Weakly similar to MesP1 [M.musculu 8.5

301846 R20002 Hs.6823 ESTs; Weakly similar to intrinsic factor 4.6

301891 AF131855 Hs.279591 Homo sapiens clone 25056 mRNA sequence 6.3

302005 AI869666 Hs.123119 ESTs 36.8 

302056 AI457532 Hs.30488 ESTs; Moderately similar to ROSA26 AS (M. 9.5 

302067 H05698 Hs.222399 ESTs; Weakly similar to protein-tyrosine 5.8

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 8.8

302147 AB022660 Hs.151717 KIAA0437 protein 5.9

302214 AJ001454 Hs.159425 Homo sapiens mRNA for testican-3 4.3

302236 AI128606 Hs.6557 zinc finger protein 161 4.3 

302358 D81150 Hs.322848 EST cluster (not in UniGene) with exon h 5.5 

302410 NM_004917 Hs.218366 EST cluster (not in UniGene) with exon h 265

302486 AC003682 Hs.183512 multiple UniGene matches 82

302582 NM_000522 Hs.249195 EST cluster (not in UniGene) with exon h 6.4

302785 AA425562 Hs.11065 EST cluster (not in UniGene) with exon h 5

302792 AA343696 Hs.46821 ESTs; Weakly similar to putative [H.sapi 4.8

302881 AA508353 Hs.105314 relaxin 1 (H1) 78.8

302892 N58545 Hs.42346 histone deacetylase 3 8.5

302970 AW118352 Hs.312679 EST cluster (not in UniGene) with exon h 7.4

302977 AW263124 Hs.315111 EST cluster (not in UniGene) with exon h 5.5 

303029 AF199613 EST cluster (not in UniGene) with exon h 4.6

303125 AF161352 Hs.111782 EST cluster (not in UniGene) with exon h 5.8

303280 AI571580 Hs.170307 ESTs 4.3 

303306 AA215297 Hs.61441 EST cluster (not in UniGene) with exon h 6.4 

303309 AL134164 Hs.145416 ESTs 6.6 

303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjug 195 

303380 AA298471 Hs.326567 EST cluster (not in UniGene) with exon h 6.6

303401 AA758552 Hs.309497 ESTs 6.8 

303525 AW516519 Hs£73294 ESTs 4.8 

303526 AA348111 Hs.96900 ESTs 12.1

303540 AA355607 Hs.309490 ESTs; Weakly similar to MMSET type I [H. 8.2

303572 AW338520 Hs£42540 ESTs 8.4

303885 AW500106 Hs£3643 EST cluster (not in UniGene) with exon h 4.9

303699 D30891 Hs.19525 EST cluster (not in UniGene) with exon h 15.7

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDA subunit o 6.3

303718 AI741397 Hs.114658 ESTs 4.6

303722 AA521510 Hs.145010 ESTs 125

303732 AW502405 Hs.125759 ESTs; Weakly similar to tumor suppressor 4.3

303735 AA707750 Hs.169055 ESTs; Weakly similar to cis-Golgi matrix 5.4 

303752 AI017286 Hs.5957 EST cluster (not in UniGene) with exon h 5.3 

303753 AW503733 Hs.9414 ESTs 13 

303813 AI275850 Hs.114658 EST cluster (not in UniGene) with exon h 7.8

304053 R00493 Hs.125565 translocase of inner mitochondrial membr 4.8

304218 N66373 Hs.27973 ESTs; Weakly similar to ZK354.7 [C.elega 6

305200 AA668128 Hs.45207 EST singleton (not in UniGene) with exon 5.7 

306716 AI024916 Hs^51354 ESTs 5.7 


Qfl7 Drift AftfiAIRR 
OUY 040 HWW 1 00 


CCT cinnkstnn /not in 1 Inifiono^ with OYfWl 


7.3 


9A7Q71 AlTAAftft^ 


tie 1147ft PCT ctnrtbafnn tnni In 1 IniftnnaA with oynn 
ns.0i4/D coi sngiaion \noi n uihvjuiiuj wiui waum 


5.4 


OUoOov AW0UUU4 


Up OlftAO PCT cfnnlotnn /nnt In 1 tniftonoA u?Hh nvon 

ns.uiouo coi suiyJoiixi ^nui ui uiuubjioj wtui uauii 


8.1 


30o3o2 Albiooiy 


Ue 4Aft7/Q PCT emntnhtn /rwvt in 1 lniAona\ unth ovAfl 
nS. 103/43 Col SuiyKJUJll \uQl Dl UUIUoNUJ WIUI UaUII 


c c 

O.J 


9rtQA99 A IQCOAfti 

300923 AJoooUOl 


MSxiyolo to is 


4.4 




Ue 9Q707 riH/>e.nmal nrntain I 1A 


4.5 


000071; A\A/n7ft9j»9 


Li* 0071 PCT elnnlohm fnnt In I ln?f^ona\ urith oyaa 
nS.yt 1 1 Co 1 sinyituun ynxji in uihuumuj wiui uauii 


7.4 


9ftOC7>l AWUOAftftAyl 

OU90'4 AVVZUoolW 


We 9ftftnn0 PQTc Woatrtu cimilar in till Al 1 1 CI 1RPAW) 


o 


jiuuyo mi aii » /ou 


Me 1AAR71 PCTc 


5 


91AAQQ AIRftftRAI 
OllKJyo R10O0O41 


He 1R10^4 PCTe 


11.6 




He IRRifiR PCTc 

ns. loowo cois 


5.8 


q<nocr A f 0 AO 1 AU. 
JlUOOO Alfc0£l4O 


u c lA^ftftq PCTc 
ns. i**oooy co is 


9.7 ' 


91AOCO AI79J1AA0 


Uc 197KQQ PCT Hncio r fnn\ in I Inifipno^ 


10.4 


HAylAQ AHIt977ft 


Me 1AC71A PCTc 

rio.i < «o/iu CO 15 


4.6 


01A491 AM9A997 
310431 f\\H&J££( 


Uc PCTc 

ns.i*n?ooo co » o 


72.9 


omc-70 aiai9Q9iqa 
31 0573 AVvtwiloU 


Uc 1ftft1A9 PCTc 

ns.iooiHt cois 


7.6 


"in coo Aiooomo 
31 0538 A133o01d 


Uc iAfAAR PCTc 


9.2 


31 0639 AWZoSOo* 


Uc 17C1ft9 PCTc 
MS. I/O ID*: COIS 


4.5 


0*0-707 AtAf9C9CQA 

31 0787 AWco^obv 


Uc 1A7R7A PCTc 
nS.I4/Df4 COIS 


4.9 


310816 AI9730OI 


Uc QOAQRtL PCTc 


7.6 


014AJT4 AICCCCCJO 

311251 AUwocC 


Uc 1Q7ftQfl PCTc 

ns.iy/oyo cois 


41.3 


311280 Al7o79o7 


Uc PCTc* \A7oaWu cimibr in VOflAR 1 no no nrn 


4J5 


311330 A1579524 


Uc 9A1 R9Q PCTc- MnHoratoh/ similar tA lltl Al II CI IR 
nS J-v 1 Dty CO 1 *>, WlOuclaloiy alllUidi VJ 111! ALU OU O 


4.6 


311515 AW136713 


HS ZoodZ Co 1 S 


O.o 


311574 AJ8248o3 


Ue911X9A PCTc 
hSxllA^U COIS 


48 


311587 A1828254 


U<*07iniG CCTe 

nSx/1019 to IS 


o.o 


311596 AI682088 


Uc 7QQ7C CCTe 

ns.f9o/5 to IS 


9fi4 

COM 


311631 AI809519 


It- t\~t* OO CCTe 

^ HSx713o to IS 


ft 4 


311686 AW025661 


U _ OilAAAA CCTp 

HS240090 cotS 


7 A 


311783 A1682478 


II, iOCOQ COT 

H5.13528 col 


A ft 
4.0 


311826 AA765470 


Ua OCAAO CCTa 

HS.85092 coTS 


ft 7 
0.1 


311853 AW014013 


Ua imAcc ceTe> 
HS. 107056 to IS 


£ 0 


311901 R16890 


U. 10*M9C CCT* 

HS.137135 co IS 


c; ft 

O.O 


itii AAA iiiUr 4 AP i 

311932 AW451654 


hs.257482 colS 


A 9. 
4.0 


a4a*p*a A k*trAArA 

312153 AA759250 


Hs.1 18625 cylOCnrome D-561 


1 1 


312182 AA8348QO 


li- QACOCJ CCT tli lelnt* /nnt in t lniflona\ 

nSJ2bZW to i ciusier \noi in umuenej 


1ft Q 
lu.57 


312242 AI380207 


lj_ 4qpA7£ CCTe 
HS.1252/0 COIS 


4 7 


312296 C01367 


1|_ 4 ty-t inn COT«» 

HS.127128 colS 


ft 0 

O.O 


312407 R46180 


Ua 4t44QC CCTe 

HS.153485 to IS 


ft 9 


312424 AA847398 


LI* 90-1007 CCTe 

HS^81997 colS 


4 fl 


312425 R49353 


HS293892 colS 


ft 9 


312480 R68651 


4 J iAn7 PpT- 

HS. 1449 97 colS 


Q ft 


312518 C17785 


Ua <JQO70Q CCTe 

nS.1o2/oo tOIS 


C 9, 


312521 AA033609 


• lis, OinCO 4 CCTe 

H&239oo4 co IS 




312527 AI695522 


Ua CCTe 

HS.191271 colS 


LI 


312539 AIQ04377 


u„ onnoCA CCTe 

HS200350 colS 


7 

f 


312546 A1623511 


Ua HDCC7 CCTe 

HS.1 18567 coTS 


ft 1 
0.1 


312563 AA976064 


Ua 4 OflQ 40 CCTe 

HS. 180842 coTS 


ft ft 
0.0 


312623 AA694607 


Hs.17d9oo co I ciusier [noi m uniuene; 




312857 AA772279 


HS.126914 ESTS 


c 
0 


312890 AIB13654 


HSJ957 coTS 


ft R 


312903 AA939266 


Hs-278626 tSTs 


7 7 
f.f 


312905 H92571 


Hs.2 34478 to IS 


ft ft 


312976 AA836271 


Hs.125830 ESTs 


4.0 . 


312983 A1079278 


HS269899 ESTS 


ft 1 

0.1 


312990 AAc4aul(J 


Uc 1 KAOOI PCT r4iictor fnnt in 1 Inlftanal 
nS. I04JOI CO 1 UUSlnl \IVJI III vMllVTOiiU/ 


7 


313035 N36417 Hs.144928 ESTs 6.3

313166 AI801098 Hs.151500 ESTs 4.3

313188 AI039702 Hs.179573 collagen; type 1; alpha 2 4.8

313218 AA827805 Hs.124296 ESTs 5

313226 AI200281 Hs.123910 ESTs 5.9

313325 AI420611 Hs.127832 ESTs 4.6

313326 AI088120 Hs.122329 ESTs 7.4

313425 AA745689 Hs.186838 ESTs; Weakly similar to similar to zinc 6.3

313499 AI2613S0 Hs.146085 ESTs 5.6

313540 AI797301 HSn>740 ESTs 5.9

313568 AW467376 Hs.129640 ESTs 4.3

313569 AI273419 Hs.135146 ESTs; Weakly similar to ZK1058.5 [C.eleg 4.6

313603 AW468119 Hs^87631 EST cluster (not in UniGene) 6.8




313515 AW295194 Hs301997 DKFZP434N126 protein 52

313625 AW468402 Hs254020 ESTs 7.8 

313634 AA688292 Hs337786 ESTs 4.4 

313635 AA507227 Hs.6390 ESTs 8.1 

313638 AI753075 Hs.104627 ESTs 6.7

313670 C16690 Hs.23767 EST cluster (not in UniGene) 4.4

313671 W49823 Hs.104613 ESTs 4.4

313676 AA861697 Hs.120591 EST cluster (not in UniGene) 13.4

313703 AI161293 Hs.280380 ESTs; Weakly similar to KIAA0525 protein 10

313712 AA768553 Hs.74170 ESTs 5.2

313800 AW296132 Hs.55098 ESTs 5.4

313979 AI535895 Hs221Q24 ESTs 4.3

314121 AI732100 Hs.187619 ESTs 13.6 

314123 AW245993 Hs£23394 ESTs 6.4 

314171 AI821895 Hs.193481 ESTs 29.4

314188 AL138431 Hs.164243 ESTs 45 

314219 AL036001 Hs.48376 ESTs 5.7 

314236 AA743396 Hs.189023 ESTs 4.9

314237 AA732359 Hs.96264 ESTs 4.4 

314284 AA731431 Hs293464 EST cluster (not in UniGene) 6.4

314305 AI280112 Hs.125232 ESTs 5.3

314343 AI754701 Hs328476 ESTs; Weakly similar to alternatively sp 6.2

314530 AI052358 Hs.193726 ESTs 45

314691 AW207206 Hs.136319 ESTs 17

314695 AW502698 Hs.118152 ESTs 8.9

314785 AI538226 Hs.32976 ESTs 9.4 

314801 AA481027 Hs.109045 ESTs; Weakly similar to ORF YGR245c (S.c 8 

314864 AA493811 Hs2d4G68 ESTs 6 

314907 AI672225 Hs222886 ESTs 19.3 

314916 AA548906 Hs.122244 ESTs 4.5

314954 AA521381 Hs.187726 ESTs 53

314981 AA524953 Hs£93334 ESTs 4.6

315021 AA533447 Hs312989 EST cluster (not in UniGene) 5.1

315051 AW292425 Hs.163484 EST 155

315052 AA876910 Hs.134427 ESTs 20

315073 AW452948 Hs.257631 ESTs 5.3 

315084 AI821085 ESTs 82 

315214 AI915927 Hs34771 ESTs 5.4 

315220 AI420753 Hs.66731 ESTs 5.1 

315278 AI985544 Hs.12450 ESTs 53

315282 AI222165 Hs.144923 ESTs 45

315368 AW291563 Hs.104696 ESTs 6 

315369 AA764918 Hs.256531 ESTs 4.8 
315378 AI263393 Hs.145008 ESTs 6.2 

315379 AI378329 Hs.126629 ESTs 5.4

315402 AW293424 Hs.75354 ESTs 5.1 

315442 AA977935 Hs.127274 ESTs 6.6 

315443 AW003416 Hs.160604 ESTs 55 
315528 R37257 Hs.184780 ESTs 8.1 

315593 AW198103 Hs.158154 ESTs 95

315634 AA837085 Hs22Q585 ESTs 7.8 

315705 AW449285 Hs.313636 ESTs 8.9 

315707 AI418055 Hs.161160 ESTs 5.1

315714 AA744015 Hs£98138 EST cluster (not in UniGene) 6.1

315740 T05558 Hs.156880 EST cluster (not in UniGene) 6.8

315762 AI391470 Hs.158618 ESTs 53 

315769 AA744875 Hs.189413 ESTs 5 

315843 AA679430 Hs.191697 ESTs 5.7 

315990 AI800041 Hs.190555 ESTs 9.2

316012 AA764950 Hs.119898 ESTs 43

316036 AA708016 Hs.190389 ESTs 5.9

316055 AA693880 Hs.6947 EST cluster (not in UniGene) 6.7

316074 AW517542 Hs.293273 ESTs 55

316100 AW203986 Hs.213003 ESTs 5.1

316169 AI127483 Hs.120451 ESTs 82

316442 AA760894 Hs.153023 ESTs 17.1

316491 AA766025 Hs.186854 EST 4.6

316504 AW135854 Hs.132458 ESTs 43 

316667 AW015940 Hs.232234 ESTs 7.6

316854 AA831215 Hs.159066 ESTs; Weakly similar to predicted using 5.1

316905 AW138241 Hs21Q846 ESTs 6.4 

317008 AW051597 Hs.143707 ESTs 4.4 

317019 AA864968 Hs.127699 ESTs 11 

317194 AW445167 Hs.126036 ESTs 133

317224 D56760 Hs.93029 ESTs 8.7

317404 AI806867 Hs.126594 ESTs 8.7 

317501 AA931245 Hs.137097 ESTs 11.1 

317548 AI654187 Hs.195704 ESTs 142 

317651 AW292779 Hs.169799 ESTs 5.8

317758 AI733277 Hs.128321 ESTs 5.4

317850 N29974 Hs.152982 EST cluster (not in UniGene) 11.4

317869 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 13.8

317902 A!8286Q2 Hs211265 ESTs 5.3 

317916 AI565071 Hs.159983 ESTs 7.7

318239 AI085198 Hs.164226 ESTs 13.1

318268 AI817736 Hs.182490 ESTs 62 

318327 AW294013 Hs200942 ESTs 4.6 

318363 R45530 Hs.1440 gamma-aminobutyric acid (GABA) A recepto 6

318428 AI949409 Hs.194591 ESTs 123

318464 AI151010 Hs.157774 ESTs 4.3 

318524 AW291511 Hs.159066 ESTs 25.9 

318540 T30280 Hs274803 EST cluster (not in UniGene) 7 

318591 AW206806 Hs.115325 ESTs 43

318615 AI133617 Hs.10177 ESTs 53

318646 AW175665 Hs278695 ESTs 5.7 

318667 AI493742 Hs.165210 ESTs 11 

318668 W26276 Hs.136075 ESTs 5.9 

318753 AA578265 Hs.7130 copine IV 53

319080 Z45131 Hs23023 ESTs 16.9

319181 F06504 Hs27384 EST cluster (not in UniGene) 4.6 

319191 AF071538 Hs.79414 prostate epithelium-specific Ets transcr 6.6 

319233 R21054 Hs.180532 ESTs 4.9 

319586 D78808 Hs283683 ESTs 82

319750 AA621606 Hs.117956 ESTs 9.3

319763 AM60775 Hs£295 ESTs 14.3 

319824 AA424266 Hs.123642 EST cluster (not in UniGene) 12.8 

319838 AA337642 Hs.95262 nuclear factor related to kappa B bindin 5.1

319913 AA179304 H&271586 ESTs; Moderately similar to !!!! ALU SUB 4.3

319964 T80579 Hs290270 ESTs 5.8

320076 AI653733 Hs271593 ESTs 83 

320102 AW296219 Hs.115325 RAB7; member RAS oncogene family-like 1 9.8

320187 T99949 Hs303428 EST cluster (not in UniGene) 9.8 

320211 AL039402 Hs.125783 DEME-6 protein 7.9

320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562

320455 R49889 H&24144 EST cluster (not in UniGene) 83 

320464 AI089817 Hs237146 ESTs 5.4 

320561 NM_006953 Hs.159330 EST cluster (not in UniGene) 7

320574 AL049443 Hs.161283 Homo sapiens mRNA; cDNA DKFZp586N2020 (f 4.4

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (fr 6.7

320654 AW263086 Hs.118112 ESTs 6

320796 AF038966 Hs.31218 secretory carrier membrane protein 1 133

320800 AI681006 Hs.71721 ESTs 62

320813 AW360847 Hs.16578 ESTs 93 

320853 AI473796 Hs.135904 ESTs 8.1

320856 D59945 Hs.65366 EST cluster (not in UniGene) 6

320899 AA633772 Hs.116798 ESTs 92 

320918 AW195012 Hs293970 ESTs 5 

320973 H19732 HS247917 ESTs 5.9 

321099 AA018386 Hs.64341 ESTs 43

321190 H52462 Hs.163872 EST cluster (not in UniGene) 5.8

321318 AB033041 Hs.137507 EST cluster (not in UniGene) 8.4

321382 AW372449 Hs.175982 EST cluster (not in UniGene) 73

321441 AW297633 Hs.118498 ESTs 14.7

321538 H80483 Hs.46903 EST cluster (not in UniGene) 92

321609 H86021 Hs.182538 ESTs; Weakly similar to hMmTRAlb (H.sapi 43

321636 AI791838 Hs.193465 ESTs 53 

321638 AI356352 Hs.108932 ESTs 4.6 

321644 AI204177 Hs237396 ESTs 6.6

321681 AA233821 Hs.190173 EST cluster (not in UniGene) 4.6

321726 X91221 Hs.144465 EST cluster (not in UniGene) 5

321758 U29112 Hs.196151 EST cluster (not in UniGene) 6.2

321877 AL109784 Hs.189222 EST cluster (not in UniGene) 4.6

321899 N55158 Hs.29468 ESTs 4.6

321902 AA746374 Hs.145010 ESTs 82 

322007 AW410646 Hs.164649 ESTs 5.1 

322055 AL137646 Hs.146001 EST cluster (not in UniGene) 4.3

322092 AF085833 Hs.135624 EST cluster (not in UniGene) 4.3

322221 AI890619 Hs.179662 nucleosome assembly protein 1-like 1 4.4

322278 AF086283 EST cluster (not in UniGene) 5.8

322303 W07459 Hs.157601 EST cluster (not in UniGene) 22

322437 AW393804 Hs.170253 ESTs; Weakly similar to rabaptin-4 [H.sa 4.4

322493 AF143235 Hs.279819 EST cluster (not in UniGene) 72

322782 AA056060 Hs.202577 EST cluster (not in UniGene) 18.4

322811 AA782292 Hs.105872 ESTs 6.9 

322818 AW043782 Hs.293616 ESTs 10.7 

322826 AI807883 Hs.180059 ESTs 5 

322887 AI986306 Hs.86149 ESTs; Weakly similar to KIAA0969 protein 115 

322889 AA081924 Hs.124918 ESTs 7.1

322924 AA669253 Hs.136075 ESTs 45 

322982 AI351191 Hs.128430 ESTs 6.6 

322994 AA422116 Hs.191461 ESTs 4.7 

323040 AA336609 Hs.10862 ESTs 6.9 

323041 AL118747 Hs.26691 EST cluster (not in UniGene) 8.3

323045 AA148950 Hs.188836 ESTs 4.6

323048 AL118923 Hs.175110 EST cluster (not in UniGene) 7.5

323070 AA157726 Hs.264330 ESTs 7.5 

323071 AA157867 Hs.5722 ESTs 47 

323097 Z44354 Hs.296261 guanine nucleotide binding protein (G pr 4.9

323131 AA176982 Hs.270124 EST cluster (not in UniGene) 6.1

323136 AL120351 Hs.30177 EST cluster (not in UniGene) 4.3

323175 AI827137 Hs.336454 ESTs 6.2

323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 6.3

323226 AF055019 Hs.21906 Homo sapiens clone 24670 mRNA sequence 12.6

323236 AA363148 Hs.293960 ESTs 10.9 

323262 AI829770 Hs.190642 ESTs 7.6 

323276 AA836452 Hs.323822 ESTs 7.6 

323287 AA639902 Hs.104215 ESTs 24.7 

323335 AI655499 Hs.161712 ESTs 14.1

323341 AL134875 Hs.108646 ESTs 5.3

323362 AL135067 Hs.117182 ESTs 6.1

323486 C05278 Hs.299221 ESTs; Moderately similar to [PYRUVATE DE 8.5 

323496 AI826801 Hs.300700 ESTs 4.5 

323507 H71721 Hs.128387 ESTs 4.4

323545 AI814405 Hs.224569 ESTs 5.8

323623 AA314280 Hs.146589 EST cluster (not in UniGene) 5

323663 AW263526 Hs.243023 ESTs 7.7

323691 AA317561 Hs.145599 EST cluster (not in UniGene) 5.9

323810 AA740405 Hs.108806 ESTs 62

323846 AA337621 Hs.137635 ESTs 6 

323929 AA354940 Hs.145958 ESTs 10.7 

323959 AI636775 Hs.6831 ESTs 5.4

323996 AA367032 Hs.217882 ESTs 5.8 

323997 AA844907 Hs.274454 EST cluster (not in UniGene) 4.4

324019 AW177009 EST cluster (not in UniGene) 4.6

324130 AL046575 Hs.130198 ESTs 11 

324295 AI146686 Hs.143691 ESTs 13.7 

324296 AI524039 Hs.192524 ESTs 6.8 

324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 4.9

324330 AA884766 EST cluster (not in UniGene) 4.3

324385 F28212 Hs.284247 EST cluster (not in UniGene) 4.7

324430 AA464018 Hs.184598 EST cluster (not in UniGene) 13.6 

324452 AW014022 Hs.170953 ESTs 7.6

324547 AW501974 Hs.74170 ESTs 5.6

324603 AW016378 Hs592934 ESTs 242 

324617 AA508552 Hs.195839 ESTs 54 

324618 AI346262 HsJ7159 ESTs 4.6 

324620 AA448021 Hs£4109 EST cluster (not in UniGene) 5.7




324526 
324658 
324676 
324691 
324696 
324713 
324715 
324718 
324720 
324752 
324753 
324790 
324601 
324804 
324845 
324888 
324929 
324961 
325108 
326816 
326997 
327098 
328492 
329362 
329929 



AI685464 

AI694767 

AW503943 

AI217963 

AA641092 

AW340249 

AI739168 

AJ557019 

AA578904 

AI279919 

AA612626 

AI334367 

AI819924 

AI692552

AA361016

AI564134

AI741633

AA613792 

AA401863 



Hs.129179 
Hs.112451
Hs.293341 
Hs.257339 
Hs.163440 
Hs.131798 
Hs.116467
Hs.292437 
Hs.272072 
Hs.144871 
Hs.159337 
Hs.14553 

Hs.337533 
Hs.136102 
Hs.125350 

Hs£2380 



330020 
330211 
330384 
330430 
330546 
330551 
330658 
330700 
330704 
330705 
330706 
330712 
330725 
330732 
330762 
330763 
330772 
330786 
330892 
330949 
330977 
331017 
331099 
331128 
331151 
331195 
331320 
331321 
331337 
331348 
331359 
331383 
331422 
331442 
331466 
331479 
331490 
331493 
331561 
331615 
331659 
331696 
331811 



M23263 

HG2261-HT2352 
U31382 Hs£99867 
U39840 
AA319514 
AA037415 
AA056557 
AA102571 
AA121140 
AA167269 
AA252033 
AA281092 
AA449677 
AA450200 
AA479114 
D60374 
AA149579 
K01458 
H20826 
N24619 
R36671 
R51361 
R82331 
T64447 
AA262999 
AA278355 
AA287662 
AA400596 
AA416979 
AA454543 
F10802 
H77381 
N21680 
N27154 
N32912 
N34357 
N62780 
N92352 
W48868 
Z38907 
AA404500 



Hs.30732 
Hs.20999 
Hs.6759 
Hs.157078 
Hs.177576 
Hs.52620 
Hs.24052 
Hs.35254 
Hs.15251 
Hs.143187 
Hs.1 1356 

Hs.91202 
Hs.142896 
Hs.315181 
Hs.108920 
Hs.14846 
Hs.268714 
Hs.268838 
Hs.168439 
Hs.300141 
Hs.87929 
Hs.1 18630 
Hs.88143 
HsJ1897 
Hs.43543 
Hs.237339 
Hs.41223 
Hs.43455 
Hs.44076 
Hs591039 
Hs.93817 
Hs.48703 
Hs.5472 
Hs.334305 
Hs.65949 
Hs.1 87958 



ESTs 
ESTs 
ESTs 

ESTs; Weakly similar to Pro-a2(XI) [H.sa 

ESTs 

ESTs 

EST cluster (not in UniGene) 

ESTs 

ESTs 

ESTs; Moderately similar to !!!! ALU SUB 

EST cluster (not in UniGene) 

ESTs 

ESTs 

ESTs 

ESTs 

KIAA0853 protein 
ESTs 

EST cluster (not in UniGene) 
ESTs 

CH.20_hsgi|6552458 

CH.21_hsgi|5867660 

CH.21Jisgi[6682516 

CH.07_hsgi[5868455 

CH.X hsgi|5868B37 

CH.16_p2gi|6165201 

CH.16_p2gi|5091594 

CH.16_p2gi|6671887 

CH.05 _p2gi|6013592 - 

androgen receptor (dihydrotestosterone r 

Hs.321110 

guanine nucleotide binding protein 4 

hepatocyte nuclear factor 3; alpha 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Moderately similar to kynurenine a 
ESTs 

ESTs; Weakly similar to 111! ALU SUBFAMI 
ESTs 

Human DNA sequence from clone 437M21 on 

FK506-btnding protein 3 (25kD) 

ESTs 

EST 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Moderately similar to !!!! ALU SUB 

ESTs 

ESTs 

ESTs 

ESTs; Weakly similar to hypothetical 43. 

ESTs 

ESTs 

ESTs 

ESTs 

K1AA0888 protein 
ESTs 



9 

22 

4.9 

10.6 

105 

55 

12 

34.4 

4.8 

7.9 

52 

7.6 

12.6 

6.5 

4.5 

4.4 

6.5 

5.1 

7.1 

9.6 

4.8 

4.3 

5.8 

4.3 

5.5 

7.6 

6 

12.6 
9 

Antigen, Prostate Specific, Aft. Splice 
6 

4.9 
6 

5.5 

5.1 

11.7 

US 

5 

72 

4.9 

185 

4.3 

5.8 

4.6 

15.3 

10.3 

4.4 

11.8 

11.6 

4.8 

13 

4.9 

4.8 . 

6.1 

92 

9.9 

4.3 

4.6 

4.9 

7.5 

5.4 

65 

125 

4.6 

92 

4.6 

6.7 

10.3 

43 
331848 AA417039 Hs.98268 signal recognition particle 72 kD 7.5
331873 AM29445 Hs.98640 ESTs 6.5
331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 33.6
331967 AA460158 Hs.99589 KIAA1028 protein 6.8
331974 AA464518 Hs.105322 ESTs 5.3
332043 AA490831 Hs.201591 ESTs 10.8
332076 AA599477 Hs.291156 ESTs 4.4
332173 F09281 Hs.100725 ESTs 5.5
332247 N58172 ESTs 14.2
332249 N62096 Hs.194140 ESTs 72
332325 T79428 Hs.339667 ESTs 5.6
332396 AA340504 ESTs; Weakly similar to similar to human 21.2
332434 N75542 Hs.237731 transcription factor 4 15.3
332493 N95495 Hs.56729 ESTs; Highly similar to GTP-binding prot 7.1
332522 L38503 Hs.178357 glutathione S-transferase theta 2 6.6
332526 AA281753 Hs.17731 inositol 1;4;5-triphosphate receptor, ty 5.8
332530 M31682 Hs.19280 inhibin; beta B (activin AB beta polypep 5.5
332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1
332538 N48715 Hs.20991 ESTs 6.5
332546 D84454 Hs.22587 solute carrier family 35 (UDP-galactose 4.8
332594 AA279313 Hs.32951 methyl CpG binding protein 2 5.6
332610 AA412405 Hs.40513 ESTs; Weakly similar to BETA GALACTOSIDA 5.6
332661 N95742 Hs.6390 ESTs 6.9
332697 T94885 Hs.75725 carboxypeptidase E 24.3
332712 D26070 Hs.79306 inositol 1;4;5-triphosphate receptor; ty 9.9
332716 L00058 Hs.79630 v-myc avian myelocytomatosis viral oncog 5.6
332726 R72029 Hs.83428 synaptophysin-like protein 5
332781 AA233258 ESTs; Weakly similar to D1007.5 [C.elega 4.5
332797 CH22_FGENES.6_2 30.8
332798 CH22_FGENES.6_5 66.8
332799 CH22_FGENES.6_6 19.8
332933 CH22_FGENES.38_7 5.6
332980 CH22_FGENES.54_1 5.5
332984 CH22_FGENES.54_6 4.9
333168 CH22_FGENES.94_1 4.7
333169 CH22_FGENES.94_2 4.4
333452 CH22_FGENES.157_1 4.8
333456 CH22_FGENES.157_5 4.3
333458 CH22_FGENES.157_7 4.6
333611 CH22_FGENES.217_6 4.7
333621 CH22_FGENES.219_5 5.5
333814 CH22_FGENES.282_2 7.1
333849 CH22_FGENES.290_8 62
333949 CH22_FGENES.303_5 4.3
333951 CH22_FGENES.303_7 4.9
333955 CH22_FGENES.303_11 5.6
334150 CH22_FGENES.339_1 5.1
334223 CH22_FGENES.360_4 20.3
334297 CH22_FGENES.372_3 9.4
334443 CH22_FGENES.387_2 4.6
334444 CH22_FGENES.387_4 5.6
334447 CH22_FGENES.387_7 13.1
334570 CH22_FGENES.405_11 5.4
334749 CH22_FGENES.427_1 5.3
334777 CH22_FGENES.430_9 4.7
334960 CH22_FGENES.465_29 52
335179 CH22_FGENES.504_9 8.8
335293 CH22_FGENES.527_6 4.7
335550 CH22_FGENES.576_11 5.1
335581 CH22_FGENES.581_19 5.7
335586 CH22_FGENES.581_25 4.3
335809 CH22_FGENES.617_6 62
335810 CH22_FGENES.617_7 5.8
335822 CH22_FGENES.619_7 7.1
335824 CH22_FGENES.619_11 8.5
335853 CH22_FGENES.626_5 4.3
335886 CH22_FGENES.632_4 4.3
336034 CH22_FGENES.678_5 6.8
336441 CH22_FGENES.827_7 7 Jo
336624 CH22_FGENES.6-3 43.3
336625 CH22_FGENES.6-4 37.9
336679 CH22_FGENES.437 5.3
337577 CH22_C65E1.GENSCAN.8-1 45
338255 CH22_EM:AC005500.GENSCAN.276-3 13.4
338260 CH22_EM:AC005500.GENSCAN.279-10 4.6
338561 CH22_EM:AC005500.GENSCAN.421-5 4.6
338562 CH22_EM:AC005500.GENSCAN.421-6 4.3
338759 CH22_EM:AC005500.GENSCAN.517-6 5.1
338763 CH22_EM:AC005500.GENSCAN.517-16 5.5
338764 CH22_EM:AC005500.GENSCAN.517-17 7.1
TABLE 3A shows the accession numbers for those primekeys lacking UniGene IDs in Table
3. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 
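
The three columns defined above can be read into a simple record. The following is a minimal sketch, assuming each cleaned row is whitespace-separated with the Pkey first, the CAT number second, and zero or more Genbank accession numbers after that. The names `ClusterRow` and `parse_cluster_row` are hypothetical and are not part of the original disclosure.

```python
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    pkey: int              # Eos probeset identifier
    cat_number: str        # gene cluster number, kept as text
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one cleaned Table 3A row into its three columns (assumed layout)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least Pkey and CAT number, got: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


# Two rows taken from Table 3A: one with an accession, one without.
with_accession = parse_cluster_row("124357 genbank_N22401 N22401")
without_accession = parse_cluster_row("326816 c20_hs")
print(with_accession)
print(without_accession)
```

Rows for predicted exons carry no accession numbers, so the sketch returns an empty accession list for them rather than treating the row as malformed.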



Pkey CAT number Accession
123619 371681_1 AA602964 AA609200
116722 143512_1 Z24878 AA494098 F13654 AA494040 AA143127
103677 41847_1 Z83806 AJ132091 AJ132090
125992 1589048_1 H48372 W01626
109342 genbank_AA213620 AA213620
125154 genbank_W38419 W38419
101447 entrez_M21305 M21305
124357 genbank_N22401 N22401
108910 genbank_AA136590 AA136590
322278 47271_1 W69304 AF086283 W69200
315084 350959_1 AI821085 AW973464 AA554802 AI821831 AA657438 AA640756 AA650339
324019 262792_1 AW177009 AI381610
324330 300543_1 AA884766 AW974271 AA592975 AA447312
324626 336411_1 AI685464 AW971336 AA513587 AA525142
303029 37699_1 AF199613 AF108756
324804 398093_1 AI692552 AI393343 AI800510 AI377711 F24263 AA661876
324961 376239_1 AA613792 AW182329 T05304 AW858385
329362 C_X_hs
336624 CH22_4071FG_6_3_
336625 CH22_4072FG_6_4_
336679 CH22J157FGJJ3JL
338255 CH22_6856FG_LINK_EM:AC00
338260 CH22_6863FG_LINK_EM:AC00
329929 C16_p2
329960 c16_p2
338561 CH22_7294FG_LINK_EM:AC00
338562 CH22_7295FG_LINK_EM:AC00
338759 CH22_7581FG_LINK_EM:AC00
338763 CH22_7585FG_LINK_EM:AC00
338764 CH22_7586FG_LINK_EM:AC00
333168 CH22_400FG_94_1_LINK_EM:A
333169 CH22_401FG_94_2_LINK_EM:A
333452 CH22_702FG_157_1_LINK_EM:
333456 CH22_706FG_157_5_LINK_EM:
333458 CH22_708FG_157_7_LINK_EM:
333611 CH22_872FG_217_6_LINK_EM:
333621 CH22_882FG_219_5_LINK_EM:
333814 CH22_1083FG_282_2_LINK_EM
333849 CH22_1118FG_290_8_LINK_EM
335179 CH22_2515FG_504_9_LINK_EM
333949 CH22_1225FG_303_5_LINK_EM
333951 CH22_1227FG_303_7_LINK_EM
333955 CH22_1231FG_303_11_LINK_E
335293 CH22_2635FG_527_6_LINK_EM
326816 c20_hs
326997 c21_hs
335550 CH22_2905FG_576_11_LINK_E
335581 CH22_2938FG_581_19_LINK_E
335586 CH22_2944FG_581_25_LINK_E
328492 C_7_hs
335809 CH22_3181FG_617_6_LINK_EM
335810 CH22_3182FG_617_7_LINK_EM
335822 CH22_3195FG_619_7_LINK_EM
335824 CH22_3197FG_619_11_LINK_E
335853 CH22_3228FG_626_5_LINK_EM
335886 CH22_3261FG_632_4_LINK_EM
330020 c16_p2
330211 C_5_p2
337577 CH22_5864FG_LINK_C65E1.G
307848 AI3S4186
332797 CH22_13FG_6_2_LINK_C4G1.G
332798 CH22_14FG_6_5_LINK_C4G1.G
332799 CH22_15FG_6_6_LINK_C4G1.G
334150 CH22_1429FG_339_1_LINK_EM
332933 CH22_154FG_38_7_LINK_C20H
332980 CH22_204FG_54_1_LINK_EM:A
332984 CH22_208FG_54_6_LINK_EM:A
334223 CH22_1507FG_360_4_LINK_EM
334297 CH22_1588FG_372_3_LINK_EM
327098 C21_hs
334443 CH22_1742FG_387_2_LINK_EM
334444 CH22_1743FG_387_4_LINK_EM
334447 CH22_1746FG_387_7_LINK_EM
334570 CH22_1875FG_405_11_LINK_E
334749 CH22_2061FG_427_1_LINK_EM
334777 CH22_2089FG_430_9_LINK_EM
336034 CH22_3419FG_678_5_LINK_DJ
334960 CH22_2281FG_465_29_LINK_E
336441 CH22_3861FG_827_7_LINK_DJ
330551 9851_2 U39840 NM_004496 AW135607 BE087458 BE087567 AA177116 AW195705 AW750756 AI811008 AI694151
    BE348594 AW971075 AI347950 AI201455 AI073898 AA652680 AA613671 AI318364 AA507550 AA693692
    AI032599 AA991871 AI269801 AW948974 T74639 AA532907 AW949173
330786 53973_3 BE379594 AI192455 AL039862 AI744012 AI761735 AW243181 AI743687 AI928223 AI423022 AI627855
    AI636059 AI651571 AW802044 AI826995 AI431733 AI539125 AA863056 AW270910 AI768930 AW008835
    AW615183 AW591147 AI695294 AI672106 AA506358 AI308060 AA011556 AA962437 AI935488 BE219625
    AI004356 AW151394 AI218466 N66178 AI419784 AW242519 AW946907 D60374 AA989263 AI698799
    AA470460 AI824167
332247 372969_1 AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172
332396 20265_1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798
    R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946 R53463 H11063
    AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497 AW954769 AA036808
    BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983 AI805213 AI761264 W94885
    N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777 AI352312 AI367474 AW204807
    AI675502 AI337026 AW134715 BE328451 AI123157 AI560020 AI300745 AI608631 AI248873 AA742484
    AW051635 H18646 AI245045 AA507111 AI640510 AI925594 AA115747 AA143035 AA151106
332781 32044_1 AK001764 BE313896 AA380199 AA380151 AA194996 AW118089 AA495871 AW975219 AW085598
    AI378909 AW992310 AW992409 AI911857 AA657643 AI804471 AI242589 AI623968 R09556 AJ129100
    AI206500 AA680094 AA677784 AI023178 AI277519 AA424742 AI240654 AA232846 AI804273 AI382376
    AA001729 W90790 BE090656 AW295015 AI674596 AI431734 AI420517 AW769185 AI128355 AI192474
    AI8200G1 AA001929 AA706925 AI076676 AI499119 AI200493 AI695919 AI376217 W69195 W69261
    AW305099 W90320 BE048357 AI658856 AA838534 AA233258 AI753393 AA709227 AI674387 AI872616
TABLE 3B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 3. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
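
As an illustration of how the Strand and nucleotide position columns might be consumed, the following minimal sketch normalizes one entry to ascending coordinates and computes an exon length. It assumes the positions are inclusive, which the text does not state, and it tolerates both orderings seen for Minus-strand entries in the table (for example "216964-216798" and "1031-1162"). The function name `parse_exon` is hypothetical.

```python
from typing import Tuple


def parse_exon(strand: str, nt_position: str) -> Tuple[int, int, int]:
    """Return (start, end, length) with start <= end.

    Assumption: coordinates are inclusive, so length = end - start + 1.
    """
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    first, second = (int(part) for part in nt_position.split("-"))
    # Minus-strand entries are sometimes listed high-to-low; normalize the order.
    start, end = min(first, second), max(first, second)
    return start, end, end - start + 1


print(parse_exon("Minus", "216964-216798"))
print(parse_exon("Plus", "6548368-6548507"))
```

Keeping the strand as a separate value, rather than encoding it in the coordinate order, avoids ambiguity when the two orderings are mixed in one table.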



Pkey Ref Strand Nt_position
333611 Dunham, I. et al. Plus 6548368-6548507
333621 Dunham, I. et al. Plus 8597414-8597560
333814 Dunham, I. et al. Plus 7894165-7894252
333849 Dunham, I. et al. Plus 8018323-8018472
333949 Dunham, I. et al. Plus 8589634-8589791
333951 Dunham, I. et al. Plus 8592501-8592637
333955 Dunham, I. et al. Plus 8597414-8597560
334150 Dunham, I. et al. Plus 10529221-10529854
334297 Dunham, I. et al. Plus 13420934-13421058
334443 Dunham, I. et al. Plus 14298981-14299056
334444 Dunham, I. et al. Plus 14306433-14306492
334447 Dunham, I. et al. Plus 14308764-14308824
334570 Dunham, I. et al. Plus 14994868-14994943
334777 Dunham, I. et al. Plus 16259586-16260166
335179 Dunham, I. et al. Plus 21634405-21634526
335581 Dunham, I. et al. Plus 24976198-24976334
335586 Dunham, I. et al. Plus 24990333-24990497
335809 Dunham, I. et al. Plus 26310772-26310909
335810 Dunham, I. et al. Plus 26314767-26314849
335822 Dunham, I. et al. Plus 26364087-26364196
335824 Dunham, I. et al. Plus 26376860-26376942
335886 Dunham, I. et al. Plus 26934235-26934364
336034 Dunham, I. et al. Plus 29014404-29014590
336441 Dunham, I. et al. Plus 34187606-34187663
337577 Dunham, I. et al. Plus 595377-595678
338260 Dunham, I. et al. Plus 15458919-15459257
332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
332933 Dunham, I. et al. Minus 2035790-2035681
332980 Dunham, I. et al. Minus 5136165-5136019
332984 Dunham, I. et al. Minus 2632606-2632457
333168 Dunham, I. et al. Minus 3729896-3729788
333169 Dunham, I. et al. Minus 3730864-3730767
333452 Dunham, I. et al. Minus 5136165-5136019
333456 Dunham, I. et al. Minus 2631933-2631797
333458 Dunham, I. et al. Minus 5143942-5143806
334223 Dunham, I. et al. Minus 12734365-12734269
334749 Dunham, I. et al. Minus 16090686-16090106
334960 Dunham, I. et al. Minus 20160968-20160795
335293 Dunham, I. et al. Minus 22316408-22316275
335550 Dunham, I. et al. Minus 24668714-24668658
335853 Dunham, I. et al. Minus 26614629-26614506
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
336679 Dunham, I. et al. Minus 2035790-2035681
338255 Dunham, I. et al. Minus 15242294-15242231
338561 Dunham, I. et al. Minus 22311966-22311856
338562 Dunham, I. et al. Minus 22312594-22312465
338759 Dunham, I. et al. Minus 26582475-26582199
338763 Dunham, I. et al. Minus 26628148-26628009
338764 Dunham, I. et al. Minus 26641232-26641101
329960 5091594 Minus 1031-1162
329929 6165201 Minus 156410-156553
330020 6671887 Plus 172397-172491
326816 6552458 Plus 198354-198436
326997 5867660 Minus 71389-72147
327098 6682516 Minus 1061684-1062361
330211 6013592 Plus 59158-59215
328492 5868455 Minus 46094-46241
329362 5868837 Minus 65688-68173






WO 02/30268 



PCT/US01/32045 



TABLE 4 shows a preferred subset of the Accession numbers for genes found in Table 3
which are differentially expressed in prostate tumor tissue compared to normal prostate
tissue.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue



Pkey ExAccn UnigeneID Unigene Title R1
luuoiy HG4020-HT4290 Hs.2387 Transglutaminase in ^
U75272 Hs.1867 progastricsin (pepsinogen C) 10.6
X02544 Hs.572 orosomucoid 1 22.6
lOoo/U AA236476 Hs.22791 ESTs; Weakly similar to transmembrane pr 10.3
lUot)45 AA282138 Hs.11325 ESTs 14
100034 AA419461 Hs.23317 ESTs 105
1U»U14 AA156790 Hs.262036 ESTs 15.3
4 An ceo F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 10.8
113021 T23855 Hs.129836 KIAA1028 protein 10.8
114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 215
122791 AA460158 Hs.129836 KIAA1028 protein 12.4
124352 N21626 Hs.102406 ESTs 102
301042 AI659131 Hs.197733 ESTs 24.9
302005 AI869666 Hs.123119 ESTs 365
302410 NMJXW917 Hs.218366 EST cluster (not in UniGene) with exon h 265
302881 AA508353 Hs.105314 relaxin 1 (H1) 785
303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjug 195
303753 AW503733 Hs.9414 ESTs 13
310431 AI420227 Hs.149358 ESTs 72.9
311251 AI655662 Hs.197698 ESTs 41.3
311596 AI682088 Hs.79375 ESTs 26.4
312153 AA759250 Hs.118625 cytochrome b-561 11
312521 AA033609 Hs.239884 ESTs 115
313070 AA861697 Hs.120591 EST cluster (not in UniGene) 13.4
314171 AI821895 Hs.193481 ESTs 29.4
314907 AI572225 Hs.222886 ESTs 19.3
315051 AW292425 Hs.163484 EST 155
315052 AA876910 Hs.134427 ESTs 20
317548 AI654187 Hs.195704 ESTs 142
317869 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 13.8
318428 AI949409 Hs.194591 ESTs 125
318524 AW291511 Hs.159066 ESTs 25.9
319080 Z45131 Hs.23023 ESTs 165
319763 AA460775 Hs.6295 ESTs 14.3
320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562
321441 AW297633 Hs.118498 ESTs 14.7
322303 W07459 Hs.157601 EST cluster (not in UniGene) 22
322782 AA056060 Hs.202577 EST cluster (not in UniGene) 18.4
322818 AW043782 Hs.293616 ESTs 10.7
323287 AA639902 Hs.104215 ESTs 24.7
324603 AW016378 Hs.292934 ESTs 242
324617 AA508552 Hs.195839 ESTs 54
324658 AI694767 Hs.129179 ESTs 22
324691 AI217963 Hs.293341 ESTs; Weakly similar to Pro-a2(XI) [H.sa 10.6
324696 AA641092 Hs.257339 ESTs 102
324718 AI557019 Hs.116467 ESTs 34.4
330211 CH.05_p2gi|6013592 125
330430 HG2261-HT2352 Hs.321110 Antigen, Prostate Specific, Alt. Splice 135
330706 AA121140 Hs.177576 ESTs; Moderately similar to kynurenine a 145
330762 AA449677 Hs.15251 Human DNA sequence from clone 437M21 on 185
330892 AA149579 Hs.91202 ESTs 155
330949 K01458 Hs.142896 ESTs 105
331099 R36671 Hs.14846 ESTs 11.6
331151 R82331 Hs.268838 ESTs 13
331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 33.6
332247 N58172 ESTs 14.2
332396 AA340504 ESTs; Weakly similar to similar to human 21.2
332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1
332697 T94885 Hs.75725 carboxypeptidase E 24.3
332797 CH22_FGENES.6_2 30.8
332798 CH22_FGENES.6_5 66.8
332799 CH22_FGENES.6_6 19.8
334223 CH22_FGENES.360_4 20.3
336624 CH22_FGENES.6-3 43.3
336625 CH22_FGENES.6-4 37.9
TABLE 4A shows the accession numbers for those primekeys lacking UniGene IDs in Table
4. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 



Pkey CAT number Accession
336624 CH22_4071FG_6_3_
336625 CH22_4072FG_6_4_
330211 C_5_p2
332797 CH22_13FG_6_2_LINK_C4G1.G
332798 CH22_14FG_6_5_LINK_C4G1.G
332799 CH22_15FG_6_6_LINK_C4G1.G
334223 CH22_1507FG_360_4_LINK_EM
332247 372969_1 AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172
332396 20265_1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811
    AW367798 R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946
    R53463 H11063 AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497
    AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983
    AI805213 AI761264 W94885 N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777
    AI352312 AI367474 AW204807 AI675502 AI337026 AW134715 BE328451 AI123157 AI560020
    AI300745 AI608631 AI248873 AA742484 AW051635 H18646 AI245045 AA507111 AI640510 AI925594
    AA115747 AA143035 AA151106
TABLE 4B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 4. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled
"The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position
332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
334223 Dunham, I. et al. Minus 12734365-12734269
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
330211 6013592 Plus 59158-59215



TABLE 5: 1170 GENES UP-REGULATED IN PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES

Table 5 shows 1170 genes up-regulated in prostate cancer compared to normal adult tissues.
These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip array such
that the ratio of "average" prostate cancer to "average" normal adult tissues was greater than
or equal to 3.44. The "average" prostate cancer level was set to the 85th percentile amongst 73
prostate cancers. The "average" normal adult tissue level was set to the 85th percentile
amongst 162 non-malignant tissues. In order to remove gene-specific background levels of
non-specific hybridization, the 7.5th percentile value amongst the 162 non-malignant tissues
was subtracted from both the numerator and the denominator before the ratio was evaluated.
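
The selection ratio described above can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the table: the percentile convention (linear interpolation between ranks), the handling of a non-positive denominator, and the names `percentile` and `tumor_to_normal_ratio` are all assumptions that the text does not specify.

```python
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile using linear interpolation between ranks (assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def tumor_to_normal_ratio(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """Background-subtracted ratio of 85th-percentile tumor to 85th-percentile normal."""
    # Gene-specific background: 7.5th percentile amongst the non-malignant tissues.
    background = percentile(normal, 7.5)
    numerator = percentile(tumor, 85) - background
    denominator = percentile(normal, 85) - background
    if denominator <= 0:
        # Assumption: the text does not say how this case was handled.
        raise ValueError("normal level does not exceed background")
    return numerator / denominator


# Hypothetical expression values for one probeset (not data from the tables).
tumor_levels = [100.0] * 10
normal_levels = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0]
ratio = tumor_to_normal_ratio(tumor_levels, normal_levels)
print(ratio, ratio >= 3.44)
```

Subtracting the same background from numerator and denominator means a probeset with high non-specific hybridization in every tissue does not appear up-regulated merely because both levels are large.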






Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal tissue




Pkey ExAccn UnigeneID Unigene Title R1
446057 AI420227 Hs.149358 ESTs, Weakly similar to JUfiflm X.linkpri 86.42
400302 N48056 Hs.1915 folate hydrolase (prostate-specific memb 66.46
414569 AF109298 Hs.118258 prostate cancer associated protein 1 58.36
417407 AA923278 Hs.290905 ESTs, Weakly similar to protease [H.sapi 56.16
431579 AW971082 Hs.222886 ESTs, Weakly similar to TRHY_HUMAN TRICH 53.38
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo 48.28
409731 AA125985 Hs.36145 thymosin, beta, identified in neuroblast 45.24
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 43.48
420154 AI093155 Hs.95420 JM27 protein 41.12
433466 AA508353 Hs.105314 relaxin 1 (H1) 39.38
400296 AA305627 Hs.139336 ATP-binding cassette, sub-family C (CFTR 38.42
400292 AA250737 Hs.72472 ESTs 38.00
432887 AI926047 Hs.162859 ESTs 36.48
439176 AM46444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr 36.45
430722 AW968543 Hs2G3270 ESTs, Weakly similar to ALU1_HUMAN ALU S 33.20
437052 AA861697 Hs.120591 ESTs 33.02
418396 AI765805 Hs.26691 ESTs 32.68
434036 AI659131 Hs.197733 hypothetical protein MGC2849 32.44
407709 AA456135 Hs.23023 ESTs 32.10
426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen 31.30
407168 R45175 ESTs 31.72
440260 AI972867 Hs.7130 copine IV 30.52
421513 X00949 Hs.105314 relaxin 1 (H1) 30.10
416370 N90470 Hs.203697 ESTs, Weakly similar to I38022 hypotheti 29.68
407122 H20276 Hs.31742 ESTs 29.24
400287 S39329 Hs.181350 kallikrein 2, prostatic 26.90
432244 AI669973 Hs.200574 ESTs 28.74
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 28.74
415989 AI267700 Hs.111128 ESTs 28.34
418961 AW967646 Hs.23023 ESTs 27.34
425628 NM_004476 Hs.1915 folate hydrolase (prostate-specific memb 27.32
458509 AA654650 Hs.282906 ESTs 27.24
448290 AK002107 Hs.20343 Homo sapiens cDNA FLJ11245 fis, clone PL 27.16
428336 AA503115 Hs.183752 microseminoprotein, beta- 26.17
450096 AI682088 Hs.223368 holocarboxylase synthetase (biotin-[prop 25.30
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen 24.31
437571 AA760894 Hs.153023 ESTs 24.74
453160 AI263307 Hs.146228 H2B histone family, member L 24.36
453096 AW294631 Hs.11325 ESTs 24.46
425075 AA506324 Hs.1852 acid phosphatase, prostate 24.23
407202 N58172 Hs.109370 ESTs 24.18
424846 AU077324 Hs.1832 neuropeptide Y 23.57
453370 AI470523 Hs.182356 ATP-binding cassette, sub-family C (CFTR 23.16
422805 AA438989 Hs.121017 H2A histone family, member A 22.52
444917 R68651 Hs.144997 ESTs 22.26
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence 22.02
413597 AW302885 Hs.117183 ESTs 21.76
426429 X73114 Hs.169849 myosin-binding protein C, slow-type 21.32
435981 H74319 Hs.188620 ESTs 21.12
432966 AA650114 ESTs 21.07
418848 AI820961 Hs.193465 ESTs 21.06
405685 20.90
443271 BE568568 Hs.195704 ESTs 19.98
418819 AA228776 Hs.191721 ESTs 19.54
420757 X78592 Hs.99915 androgen receptor (dihydrotestosterone r 19.72
418994 AA296520 Hs.89546 selectin E (endothelial adhesion molecul 19.56
429918 AW873986 Hs.119383 ESTs 19.04
415539 AI733881 Hs.72472 ESTs 18.43
450382 AA397658 Hs.60257 Homo sapiens cDNA FLJ13598 fis, clone PL 18.34
418829 AA516531 Hs.55999 NK homeobox (Drosophila), family 3, A 18.28
429984 AL050102 Hs.227209 hypothetical protein FLJ21617 17.82
443822 AI087412 Hs.143611 ESTs, Weakly similar to 2004399A chromos 17.66
431676 AI685464 Hs.292638 gb:tt88f04.x1 NCI_CGAP_Pr28 Homo sapiens 17.64
410330 AW023630 Hs.46786 ESTs 17.52
432441 AW292425 Hs.163484 ESTs 17.41
452792 AB037765 Hs.30652 KIAA1344 protein 17.59
445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAA0293 gene, par 17.00
414565 AA502972 Hs.183390 hypothetical protein FLJ13590 16.82
430487 D87742 Hs.241552 KIAA0268 protein 16.72
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain 16.60
419536 AA603305 gb:np12d11.s1 NCI_CGAP_Pr3 Homo sapiens 16.50
439677 R82331 Hs.164599 ESTs 16.46
449625 NM_014253 Hs.23796 odz (odd Oz/ten-m, Drosophila) homolog 1 16.52
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine 16.28
447033 AI357412 Hs.157601 ESTs 16.02
453006 AI362575 Hs.167133 ESTs 15.74
431474 AL133990 Hs.190642 ESTs 15.70
420218 AW958037 Hs.22437 ribosomal protein L4 15.64
408000 L11690 Hs.620 bullous pemphigoid antigen 1 (230/240kD) 15.54
416208 AW291168 Hs.41295 ESTs, Weakly similar to MUC2_HUMAN MUCIN 15.48
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 15.40
415263 AA948033 Hs.130853 ESTs 15.38
432437 W07088 Hs.293685 ESTs 15.26
428398 AI249368 Hs.98558 ESTs 15.21
429900 AA460421 Hs.30875 ESTs 14.90
449156 AF103907 Hs.171353 prostate cancer antigen 3 14.39
411095 U80034 Hs.68583 mitochondrial intermediate peptidase 14.51
435974 U29690 Hs.37744 Homo sapiens beta-1 adrenergic receptor 14.76
444484 AK002126 Hs.11260 hypothetical protein FLJ11264 14.76
422728 AW937826 Hs.103262 ESTs, Weakly similar to ZN91_HUMAN ZINC 14.60
418601 AA279490 Hs.86368 calmegin 14.56
448999 AF179274 Hs.22791 transmembrane protein with EGF-like and 14.55
445885 AI734009 Hs.127699 KIAA1603 protein 14.44
452712 AW838616 gbflC5-LT0054-14020(H)13-D01 LT0054 Homo 14.22
432189 AA527941 gb:nh30c04.s1 NCI_CGAP_Pr3 Homo sapiens 14.12
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78
429290 AF203032 Hs.198760 neurofilament, heavy polypeptide (200kD) 13.57
419264 AA877104 Hs.293672 ESTs, Weakly similar to ALUB_HUMAN 13.40
416445 AL043004 Hs.300678 KIAA0135 protein 13.32
407275 AI364166 gb:qw34h07.x1 NCI_CGAP_Ut4 Homo sapiens 13.24
408389 R38438 Hs.182575 solute carrier family 15 (H+/peptide tra 13.21
446720 AI439136 Hs.140546 ESTs 13.06
434988 AI418055 Hs.161160 ESTs 13.02
448172 N75276 Hs.135904 ESTs 12.98
416182 NM_004354 Hs.79069 cyclin G2 12.94
420544 AA677677 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 12.79
445413 AA151342 Hs.12677 CGH 47 protein 12.64
452588 AA889120 Hs.110637 homeobox A10 12.52
407819 R42185 Hs.274803 ESTs 12.50
433444 AW975324 Hs.129816 ESTs 12.60
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421059 


A) 6541 Jo 


HS.3Q212 


thyroid receptor interacting protein 15 


19 on 

\c.O\J 


420077 


AW512260 


Hs.87767 


ESTs 


12-24 


453930 


AA419466 


Hs.36727 


hypomeucaJ protein rUi09U3 


XCXC 


441610 


AW576148 


Hs.1 48376 


ESTs 


1220 


451009 


AA013140 


Hs.1 15707 


ESTs 


12.10 


433764 


AW753676 


Hs.39982 


ESTs 


12.16 


440266 


U29589 


Hs.7138 


cholinergic receptor, muscarinic 3 


12.04 


443912 


R37257 


Hs. 184780 


ESTs 


41 oo 
11. 92 


419526 


AI821895 


Hs.193481 


ESTs 


11.91 


423073 


BE252922 


Hs.123119 


MAD (mothers against decapentaplegic, Dr 


11.0/ 


452784 


BE463857 


Hs.151258 


hypothetical protein FU21062 


4 1 DC 
11. OO 


414422 


AA147224 


Hs.71814 


ESTs 


11.76 


450203 


AF097994 


Hs.301528 


L-kynurenine/alpha-aminoadipate aminotra 


11.68 


436679 


A1 127483 


Hs.1 20451 


ESTs, Weakly similar to unnamed protein 


4 4 en 
11.60 


440901 


AA909358 


|i m 4AAA4A 

Hs.128612 


ESTs 


4 4 CA 
11.60 


448045 


AJ297436 


Hs.20166 


prostate stem ceil antigen 


1151 


433867 


AW204232 


H&279522 


ESTs 


1150 


434980 


AW770553 


Hs2 83640 


sterol Oacyl transferase (acyf-Coenzyme 


.11.38 


425905 


AB032959 


Hs.161700 


novel C3HC4 type Zinc finger (ring finge 


11.33 


434680 


T11738 


Hs.127574 


ESTs 


11.32 


449650 


AF055575 


Hs297647 


calcium channel, voltage-dependent, L ty 


11.18 


431173 


AW971198 


Hs294068 


ESTs 


11.16 


434539 


AW748078 


HS214410 


ESTs, Weakly similar to MUC2_HUMAN MUCIN 


11.16 


410037 


AB020725 


Hs58009 


KIAA0918 protein 


11.14 


417708 


N74392 


Hs50495 


ESTs 


11.14 


458332 


AI000341 


Hs220491 


ESTs 


11.12 


420381 


D50640 


Hs.301782 


phosphodiesterase 3B, cGMP-inhibited 


11.10 


425565 


AK001050 


Hs.1 59066 


hypothetical protein RJ10188 


11.08 


425710 


AF030880 


Hs.1 59275 


solute carrier family, member 4 


11.08 


428728 


NM_016625 


Hs.191381 


hypothetical protein 


11.04 


407021 


U52077 




gbiHuman mariner! transposase gene, comp 


11.02 


410733 


D84284 


Hs.66052 


CD38 antigen (p45) 


11.02 


401714 








10.90 


434485 


AI623511 


Hs.1 18567 


ESTs 


10.89 


415786 


AW419196 


Hs257924 


hypothetical protein RJ13782 


10.87 


452340 


NMJJ02202 


Hs505 


ISL1 transcription factor, LIM/homeodoma 


10.85 


453628 


AW243307 


Hs.1 70187 


hypothetical protein 


10.72 


408063 


BE086548 


Hs.42346 


calcineurin-binding protein calsarcin-1


10.67 


417687 


AI828596 


Hs.250691


ESTs 


10.64 


434666 


AF151103 


Hs.112259


T cell receptor gamma locus 


10.53


432374 


W68815 


Hs.301885 


Homo sapiens cDNA FLJ11346 fis, clone PL


10.50 


428819 


AL135623 


Hs.193914 


KIAA0575 gene product 


10.48 


413409 


AI638418 


Hs.21745


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep


10.44 


428775 


AA434579 


Hs.143691 


ESTs 


10.21


436556 


AI364997 


Hs.7572 


ESTs 


10.20


441690 


R81733 


Hs.33106 


ESTs 


10.14 


419852 


AW503756 


Hs.286184


hypothetical protein dJ551 D25 


10.10 


421991 


NM_014918


Hs.110488


KIAA0990 protein 


10.04 


423698 


AA329796 


Hs.1098 


DKFZp434J1813 protein 


10.02 


452039 


AI922988


Hs.172510 


ESTs 


10.00 


433043 


W57554 


Hs.125019 


ESTs 


9.98 


433927 


AI557019 


Hs.116467


small nuclear protein PRAC 


9.97 


445424 


AB028945 


Hs.12696 


cortactin SH3 domain-binding protein


9.96 


432240 


AI694767 


Hs.129179 


Homo sapiens cDNA FLJ13581 fis, clone PL


9.88 


433104 


AL043002 


Hs.128246 


ESTs, Moderately similar to unnamed prot 


9.84 


452744 


AI267652 


Hs.30504


Homo sapiens mRNA; cONA Di\rzp4o4tuo2 (ir 




431217 


NM_013427


Hs.250830 


Rho GTPase activating protein 6 


9.75 


427398 


AW390020 


Hs.20415


chromosome 21 open reading frame 11


9.70 


446896 


T15767 


Hs.22452


Homo sapiens mRNA for KIAA1737 protein,


9.70 


421470 


R27496 


Hs.1378 


annexin A3


9.64 


406554 








9.60 


401424 








9.58


407902 


AL117474


Hs.41181 


Homo sapiens mRNA; cDNA DKFZp727Cl91 (fr 


9.56 


423545 


AP000692 


Hs.129781 


chromosome 21 open reading frame 5 


9.54


439024 


R96696 


Hs.35598 


ESTs 


9.51


431548 


AI834273 


Hs.9711 


novel protein 


9.48 


409262 


AK000631 


Hs.52256 


hypothetical protein FLJ20624


9.45 


446271 


D82484 


Hs.100469 


ESTs 


9.42 


448692 


AW013907 


Hs.224276 


methylcrotonoyl-Coenzyme A carboxylase 2


9.26
437718
439820 
447342 
446223 
410001 
424012 
441791 
448206 
414269 
442081 
420092 
411630 
421863 
454141 
418278 
428330 
432415 
424906 
415245 
442409 
404571 
418033 
456497 
405876 
448807 
445372 
425171 
419968 
407385 
433172 
422631 
412719 
418849 
444922 
427674 
432101 
416288 
404915 
440106 
442861 
452259 
443250 
437267 
452891 
422219 
453049 
439731 
408554 
421154 
430107 
433404 
450813 
416239 
448212 
449532 
413930 
458191 
444858 
457498 
407235 
433759 
433805 



AA281279 Hs.23317 hypothetical protein FLJ14681

AF274571 Hs.129142 deoxyribonuclease II beta

AW582962 Hs.300961 CGI-47 protein

AA761526 Hs.163853 ESTs

AW188551 Hs.99519 hypothetical protein FLJ14007

BE182082 Hs.246973 ESTs

AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A

AI927288 Hs.196779 ESTs

AL360204 Hs.283853 Homo sapiens mRNA full length insert cDN

AI199268 Hs.19322 Homo sapiens, Similar to RIKEN cDNA 2010

BE300091 Hs.119699 hypothetical protein FLJ12969

AB041036 Hs.37771 kallikrein 11

AW368377 Hs.137569 tumor protein 63 kDa with strong homolog

AW372449 Hs.175982 hypothetical protein FLJ21159

BE622585 Hs.3731 ESTs, Moderately similar to I38022 hypot

AA298489 olfactory receptor, family 51, subfamily

AA401863 Hs.22380 ESTs

AA814043 Hs.88045 ESTs

U42349 Hs.71119 Putative prostate cancer tumor suppresso

AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr

AW138413 Hs.182356 ATP-binding cassette, sub-family C (CFTR

AI088489 Hs.83937 hypothetical protein

L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin,

T16971 Hs.289014 ESTs, Weakly similar to A43932 mucin 2 p

AI566086 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3'

N59650 Hs.27252 ESTs

BE208843 Hs.129544 hypothetical protein MGC15438

W68180 Hs.259855 elongation factor-2 kinase

AW967956 Hs.123648 ESTs, Weakly similar to AF108460_1 ubinu

AI571940 Hs.7549 ESTs

N36417 Hs.144928 ESTs

AW732240 Hs.300615 ESTs

X04430 Hs.93913 interleukin 6 (interferon, beta 2)

AA610150 Hs.272072 ESTs, Weakly similar to I38022 hypotheti

AB037841 Hs.102652 hypothetical protein ASH 1

BE218919 Hs.118793 hypothetical protein FLJ10688

AW016610 Hs.129911 ESTs

AW474547 Hs.53565 Homo sapiens PIG-M mRNA for mannosyltran

AI921750 Hs.144871 Homo sapiens cDNA FLJ13752 fis, clone PL

NM_003528 Hs.2178 H2B histone family, member Q

AI918950 Hs.11092 EphA3

H51299 gb:yp07c06.s1 Soares breast 3NbHBst Homo

AA864968 Hs.127699 KIAA1603 protein

AA243837 Hs.37787 ESTs

AA317439 Hs.28707 signal sequence receptor, gamma (translo

AI041530 Hs.132107 ESTs

AW511443 Hs.258110 ESTs

N75582 Hs.212875 ESTs, Weakly similar to DYH9_HUMAN CILIA

AW978073 regulator of mitotic spindle assembly 1

BE537217 Hs.30343 ESTs

AI953135 Hs.45140 hypothetical protein FLJ14084

AA836381 Hs.7323 nuclear receptor co-repressor/HDAC3 comp

AA284333 Hs.287631 Homo sapiens cDNA FLJ14269 fis, clone PL

AA465293 Hs.105069 ESTs

T32982 Hs.102720 ESTs

AI739625 Hs.203376 ESTs

AL038450 Hs.46948 ESTs

AI475858 gb:tc87d07.x1 NCI_CGAP_CLL1 Homo sapiens

W74653 Hs.271593 ESTs, Moderately similar to A47582 B-cel

M88153 Hs.75618 RAB11A, member RAS oncogene family

AI420611 Hs.127832 ESTs

AI199738 Hs.208275 ESTs, Weakly similar to ALUA_HUMAN !!!!

AI732230 Hs.191737 ESTs

020569 Hs.169407 SAC2 (suppressor of actin mutations 2, y

AA680003 Hs.109363 Homo sapiens cDNA: FLJ23603 fis, clone L

AA706910 Hs.112742 ESTs 



9.24
9.16
9.16 
9.14 
9.10 
9.07 
9.06 
9.05 
9.04 
9.03 
9.03 
9.02 
9.02 
8.99 
8.98 
835 
8.60 
830 
830 
8.78 
8.76 
8.75 
8.74 
8.72 
8.70 
8.66 
8.64 
8.56 
8.54 
8.52 
8.48 
8.44 
8.36 
8.31 
8.30 
8.27
8.24 
8.22
8.22
8.20
8.17 
8.15 
8.08 
8.07 
8.06 
8.06 
8.06 
8.04 
8.02 
8.00 
8.00 
7.98 
7.94 
7.94 
7.94 
7.93 
7.90 
7.85 
7.82 
732 
7.80 
7.80 
7.78 
7.78 
7.76 
7.74 
7.74 
425485


NM_006207 


Hs.170040


platelet-derived growth factor receptor-


iJc. 


446028 


R44714 


Hs.106795


Homo sapiens cDNA FLJ13136 fis, clone NT


7 TO 

f.fc 


418555 


AI417215 


Hs.87159 


hypothetical protein FLJ12577


7 7ft 
/./U 


447499 


AW262580 


Hs.147674


protocadherin beta 16 


/./U 


419839 


U24577 


Hs.93304


phospholipase A2, group VII (platelet-ac


/ .DO 


416857 


AA188775


Hs.292463


ESTs 


7 fift 
/.DO 


413801 


M62246 


Hs.35406


ESTs, Highly similar to unnamed protein 


7 GR 
/.DO 


425480 


AB023198 


Hs.158135 


KIAA0981 protein


/.DO 


420120 


AL045610


Hs.95243


transcription elongation factor A (SII),


7 CA 
/.o4 


424099 


AF071202 


Hs.139336


ATP-binding cassette, sub-family C (CFTR


7 RA 
/.D4 


446307 


T50083 


Hs.9094


ESTs


/.Do 


429220 


AW207206 


Hs.136319 


ESTs


7 £0 


420345 


AW295230 


Hs.25231


ESTs


7 

/ .04 


429208 


AA447990 


Hs.190478


ESTs 


7 KJt 


447247 


AW369351 


Hs.287955


Homo sapiens cuna hj iodsq lis, done in i 


7 M 
/ -OO 


440995 


T57773 


Hs.10263


ESTs


7 RO 


448706 


AW291095 


Hs.21814


interleukin 20 receptor, alpha 


7 M 


410227 


AB009284 


Hs.61152 


exostoses (multiple)-like 2


7.49


431616 


AA508552 


Hs.195839


ESTs, Weakly similar to I38022 hypotheti


7 4ft 

/.40 


434217 


AW014795 


Hs.23349 


ESTs 


7.44


431467 


N71831 


Hs.256398


Homo sapiens mRNA; cDNA DKFZp434E0528 (f


7 XO 

7 Ac 


448519 


AW175665 


Hs.244334


Homo sapiens prostein mRNA, complete cds 


7.42 


446791 


AI532278


Hs.34981 


ESTs 


7.40 


419743 


AW408762 


Hs.127478


Homo sapiens clone 24416 mRNA sequence 


7.39 


445855 


BE247129 


Hs.145569 


ESTs 


7.36 


425211 


M18667 


Hs.1867 


progastricsin (pepsinogen C)


7.35 


419131 


AA406293 


Hs.301622 


ESTs 


7.34


400294 


N95796 


Hs.179809


Homo sapiens prostein mRNA, complete cds 


7.33 


441736 


AW292779 


Hs.169799


ESTs 


7.28 


427701 


AA411101 


Hs.221750


nuclear autoantigenic sperm protein (his


7 Ovl 


457733 


AW974812 


Hs.291971


ESTs 


7.24


418432 


M14156 


Hs.85112 


insulin-like growth factor 1 (somatomedi 


7.22


441201 


AW118822


Hs.128757


ESTs 


7.21


419953 


BE267154 


Hs.125752 


ESTs 


7.20


419991 


AJ000098 


Hs.94210 


eyes absent (Drosophila) homolog 1


7.20


425018 


BE245277 


Hs.154196 


E4F transcription factor 1 


7.20


424560 


AA158727 


Hs.150555 


protein predicted by clone 23733


7 <D 
7.1 0 


435380 


AA679001 


Hs.192221 


ESTs 


7.14 


420658 


AW965215 


Hs.130707


ESTs 


7.12 


408291 


AB023191 


Hs.44131 


KIAA0974 protein 


7.10 


409110 


AA191493 


Hs.48778 


niban protein 


7.10


414485 


W27026 


Hs.182625


VAMP (vesicle-associated membrane protei


7.10 


430039 


BE253012 


Hs.153400


ESTs, Weakly similar to ALU1_HUMAN ALU S 


7.10 


450832 


AW970602 


Hs.105421


ESTs 


7.10 


417153 


X57010 


Hs.81343 


collagen, type II, alpha 1 (primary oste 


7.08 


412446 


AI768015 


Hs.32127


ESTs 


7.07 


412953 


Z45794 


Hs.238809


ESTs 


7.06 


418051 


AW192535


Hs.19479


ESTs 


7.06 


421566 


NM_000399 


Hs.1395 


early growth response 2 (Krox-20 (Drosop 


7.04 


446999 


AA151520 


Hs.279525


hypothetical protein MGC4465 


7.04 


440529 


AW207640


Hs.16478 


Homo sapiens cDNA: FLJ21718 fis, clone C


7.04 


441111 


AJ806867 


Hs.126594 


ESTs 


7.01 


451027 


AW519204 


Hs.40808 


ESTs 


7.00 


408432 


AW195262




gb:xn67b05.x1 NCI_CGAP_CML1 Homo sapiens


7.00 


432223 


AA333283 


Hs.285336


Homo sapiens, clone IMAGE:3460280, mRNA 


7.00 


444805 


AB007899 


Hs.12017 


homolog of yeast ubiquitin-protein ligas




414212 


M136569 


Hs.295940


KIAA0187 gene product 


6.98 


431725 


X65724 


Hs.2839


Norrie disease (pseudoglioma) 


6.98 


449685 


AW296669 


Hs.66095 


ESTs 


6.97 


447313 


U92981 


Hs.18081 


Homo sapiens clone DT1P1B6 mRNA, CAG rep


6.96 


424590 


AW966399 


Hs.46821 


hypothetical protein FLJ20086


6.94 


449655 


AI021987


Hs.59970 


ESTs 


6.92 


419563 


AA526235 


Hs.193162 


Homo sapiens cDNA FLJ11983 fis, clone HE


6.90 


434163 


AW974720 


Hs.25206


group XII secreted phospholipase A2


6.89 


415809 


Z32789 


Hs.46601 


ESTs 


6.86 


425782 


U66468 


Hs.159525 


cell growth regulatory with EF-hand doma


6.85
6.84 


417958 


AA767382 


Hs.193417 


ESTs 


427408 


AA583206 


Hs.2156


RAR-related orphan receptor A 


6.79 


445873 


AA250970 


Hs.251946


poly(A)-binding protein, cytoplasmic 1-l


6.74 
410718


AI920783 


Hs.191435 


ESTs 


6.74 


432383 


AA534489 




gb:nf76g11.s1 NCI_CGAP_Co3 Homo sapiens


6.74


438521 


AW203986 


Hs.213003


ESTs 


6.73 


435604 


AA625279 


Hs.26892
423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 5.91

453060 AW294092 Hs.21594 hypothetical protein MGC15754 5.91

420407 AA814732 Hs.145010 lipopolysaccaride-specific response 5-li 5.91

450480 X82125 Hs.25040 zinc finger protein 239 5.90

408446 AW450669 Hs.45068 hypothetical protein DKFZp434I143 5.88

421039 NM_003478 Hs.101299 cullin 5 5.88

451684 AF216751 Hs.26813 CDA14 5.88

436063 AK000028 Hs.250867 ribosomal protein S24 5.86

410507 AA355288 Hs.271408 transitional epithelia response protein 5.86

420179 N74530 Hs.21168 ESTs 5.84

453878 AW964440 Hs.19025 DC32 5.84
AI347502


Hs.173066


hypothetical protein FLJ20761


4.36


417251 


AW015242


Hs.99488 


ESTs, Weakly similar to YK54_YEAST HYPOT 


4.35


434423 


NM_006769


Hs.3844 


LIM domain only 4 


4.35 


423427 


AL137612 


Hs.285848


KIAA1454 protein 


4.34


415715 


F30364 




ESTs 


4.33 


404561 








4.32


422969 


AA782536 


Hs.122647


N-myristoyl transferase 2 


4.32 


423685 


RP350494 


Hs.49753 


uveal autoantigen with coiled coil domai 


4.32


443977 


AI120986


Hs.150627


ESTs, Weakly similar to I38022 hypotheti 


4.32 


425071 


NM_013989


Hs.154424 


deiodinase, iodothyronine, type II


4.32 


431583 


AL042513 


Hs.262476


S-adenosylmethionine decarboxylase 1 


4.31


411379 


AI816344 


Hs.12554 


ESTs, Weakly similar to NPL4_HUMAN NUCLE


4.30


421476 


AW953805 


Hs.21887


ESTs 


4.30 


425178 


H16097 


Hs.161027 


ESTs 


4.30 


439262 


AA832333 


Hs.124399 


ESTs 


4.30 


442818 


AK001741 


Hs.8739 


hypothetical protein FLJ10879


4.30 


421977 


W94197 


Hs.110165


ribosomal protein L26 homolog 


4.29


437114 


AA836641 


Hs.163085 


ESTs 


4.28


420195 


N44348 


Hs.300794


Homo sapiens cDNA FLJ11177 fis, clone PL


4.28


418330 


BE409405 


Hs.34722


ESTs 


4.27


419750 


AL079741 


Hs.183114 


Homo sapiens cDNA FLJ14236 fis, clone NT


4.26


437065 


AL036450 


Hs.103238 


ESTs 


4.26


455276 


BE176479 




gb:RC3-HT0585-160300-022-b09 HT0585 Homo


4.24
416292


AA179233


Hs.42390 


nasopharyngeal carcinoma susceptibility 


4.24 




423740 


Y07701 


Hs.132243 


aminopeptidase puromycin sensitive


4.24




442023 


AJ187878 


Hs.144549 


ESTs 


4.24




426764 


AA732524 


Hs.151464 


ESTs, Weakly similar to ALUC_HUMAN !!!!


4.23



454058 


AI273419 


Hs.135146 


hypothetical protein FU13984 


4.23




456511 


AA282330 


Hs.145668


ESTs 


4.22




448330 


AL036449 


Hs.207163


ESTs 


4.22




424701 


NM_005923


Hs.151988 


mitogen-activated protein kinase kinase


4.21



432621 


AI298501 


Hs.12807


ESTs, Weakly similar to T46428 hypotheti


4.20


445707 


AI248720 


Hs.114390


ESTs 


4.20




419910 


AA662913 


Hs.190173


ESTs, Weakly similar to A46010 X-linked 


4.20




424085 


NM_002914 


Hs.139226


replication factor C (activator 1) 2 (40 


4.20




440749 


W22335 


Hs.7392 


hypothetical protein MGC3199 


4.20



442787 


W93048 


Hs.227203


hypothetical protein MGC2747 


4.20


443414 


R54594 


Hs.25209


ESTs 


4.20




443556 


AA256769 


Hs.94949 


methylmalonyl-CoA epimerase


4.20




444170 


AW613879 


Hs.102408


ESTs 


4.20




446751 


AA766998 


Hs.85874 


Human DNA sequence from clone RP11-16L21


4.20



421041 


N36914 


Hs.14691


ESTs, Moderately similar to I38022 hypot


4.19 


447476 


BE293466 


Hs.20880


ESTs, Weakly similar to I38022 hypotheti


4.19 




448543 


AW897741 


Hs.21380


Homo sapiens mRNA; cDNA DKFZp586P1124 (f


4.18 




410294 


AB014515 


Hs.288891


KIAA0615 gene product 


4.18 




433607 


AA602004


Hs.23260


ESTs 


4.18 



435552 


AI668636 


Hs.193480 


ESTs, Moderately similar to ALU6_HUMAN A 


4.18 


447124 


AW976438 


Hs.17428 


RBP1-like protein


4.18 




453308 


AW959731 


Hs.32538


ESTs 


4.17 




439328 


W07411 


Hs.118212 


ESTs, Moderately similar to ALU3_HUMAN A 


4.16 




430473 


AW1306S0 


Hs.299842


ESTs 


4.16 



437257 


AI283085 


Hs.290931


ESTs, Weakly similar to YFJ7_YEAST HYPOT


4.16 


438018 


AK001160 


Hs.5999 


hypothetical protein FU10298 


4.16 




443857 


AI089292 


Hs.287621


hypothetical protein FLJ14069


4.15 




446711 


AF169692 


Hs.12450 


protocadherin 9 


4.15 




419103 


Z40229 


Hs.96423 


hypothetical protein FLJ23033


4.14 



405403 








4.14 


407378


AA299264 




ESTs, Moderately similar to I38022 hypot


4.14 




408986 


AW298602 


Hs.197687 


ESTs 


4.14 




418727 


AA227609 


Hs.94834 


ESTs 


4.14 




434400 


AI478211


Hs.186896 


Homo sapiens cDNA FLJ11417 fis, clone HE


4.14 



438578 


AA811244 


Hs.164168 


ESTs 


4.14 


450459 


AI697193 


Hs.299254


Homo sapiens cDNA: FLJ23597 fis, clone L


4.14 




429887 


AW366286 


Hs.145696 


splicing factor (CC1.3)


4.13 




448148 


NM_016578 


HS205G9 


HBV pX associated protein-8


4.13 




450316 


W84446 


Hs.17850 


hypothetical protein MGC4643 


4.12 



417531 


NM_003157 


Hs.1087 


serine/threonine kinase 2 


4.12 


431592 


R69016 


Hs.293871


hypothetical protein MGC10895s 


4.12 




432463 


AA548518 


Hs.186733 


ESTs 


4.12 




433613 


AA836126 


Hs.5669 


ESTs 


4.12 




434739 


AA304487 


Hs.144130 


ESTs 


4.12 



438259 


AW205969 


Hs.131808 


ESTs 


4.12 


425810 


AI923627 


Hs.31903 


ESTs 


4.10 




432672 


AW973775 


Hs.130760 


myosin phosphatase, target subunit 2 


4.10 




433345 


AI681545 


Hs.152982 


hypothetical protein FU13117 


4.10 




432712 


AB016247 


Hs288G31 


sterol-C5-desaturase (fungal ERG3, delta


4.09 



453020 


AL162039


Hs.31422 


Homo sapiens mRNA; cDNA DKFZp434M229 (fr 


4.09 


412045 


AA099802 


Hs.4299 


transmembrane, prostate androgen induced 


4.09 




435114 


AA775483 


Hs.288936


mitochondrial ribosomal protein L9 


4.08 




443204 


AW205878 


Hs.29643


Homo sapiens cDNA FLJ13103 fis, clone NT


4.08 




445459 


AI478629 


Hs.158465 


likely ortholog of mouse putative IKK re


4.08 



438938 


H46212 


Hs.137221 


ESTs 


4.07 


454119 


BE549773 


Hs.40510 


uncoupling protein 4 


4.06 




411000 


N40449 


Hs.201619


ESTs, Weakly similar to S38383 SEB4B pro 


4.06 




418926 


AA232658 


Hs.87070 


UDP-glucose:glycoprotein glucosyltransfe


4.06 




424432 


AB037821 


Hs.146858 


protocadherin 10 


4.06 



449673 


AA002064 


Hs.18920 


ESTs 


4.06 


429299 


AI620463 


Hs.99197 


hypothetical protein MGC13102


4.06 




422174 


AL049325 


Hs.112493


Homo sapiens mRNA; cDNA DKFZp564D036 (fr


4.05 




455497 


AA112573


Hs.285691


Homo sapiens prostein mRNA, complete cds 


4.05 




415138 


C18356 


Hs.78045 


tissue factor pathway inhibitor 2


4.04 




402791 








4.04 
426792


AL044854 


Hs.172329 


KIAA0576 protein 


4.04 


438660 


U95740 


Hs.6349 


Homo sapiens, clone IMAGE:3010666, mRNA,


4.04 


442768 


AL048534 


Hs.48458 


ESTs, Weakly similar to ALU8_HUMAN ALU S


4.04 


447568 


AF155655 


Hs.18885 


CGM 16 protein 


4.04 


428342 


AI739168 


Hs.131798 


Homo sapiens cDNA FLJ13458 fis, clone PL


4.04 


453439 


AI572438 


Hs.32976 


guanine nucleotide binding protein 4 


4.02 


453857 


AL080235


Hs.35861 


DKFZP586E1621 protein 


4.02 


428249 


AA130914 


Hs.183291 


zinc finger protein 268 


4.02 


432015 


AL157504


Hs.159115 


Homo sapiens mRNA; cDNA DKFZp586O0724 (f 


4.02 


445495 


BE622641 


Hs.38489 


ESTs, Weakly similar to I38022 hypotheti


4.02 


451746 


M86178 




ESTs 


4.02 


452211 


AI985513 


Hs.233420 


ESTs


4.02 


453046 


AA284040 


Hs.219441 


ESTs, Highly similar to CA5B_HUMAN CARBO 


4.02 


456038 


AA203285 


Hs.294141 


ESTs, Weakly similar to alternatively sp


4.02 


452449 


AW068658 


Hs.20943


ESTs 


4.02 


407204 


R41933 


Hs.140237 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


4.01 


428046 


AW812795 


Hs.155381 


ESTs, Moderately similar to I38022 hypot


4.01 


438520 


M706319 


Hs.98416 


ESTs 


4.01 


443292 


AK000213


Hs.9196 


hypothetical protein


4.01 


432715 


AA247152 


Hs.200483


ESTs, Weakly similar to KIAA1074 protein 


4.00 


403797 








4.00 


418347 


AA216419 


Hs.269295 


gb:nc16e03.s1 NCI_CGAP_Pr1 Homo sapiens


4.00 


419459 


AW291128 


Hs.278422 


DKFZP586G1122 protein


4.00 


420911 


U77413 


Hs.100293 


O-linked N-acetylglucosamine (GlcNAc) tr 


4.00 


425176 


AW015644 


Hs.301430 


TEA domain family member 1 (SV40 transcr 


4.00 


447505 


AL049266 


Hs.18724 


Homo sapiens mRNA; cDNA DKFZp564F093 (fr


4.00 


453773 


AL133761 




gb:DKFZp761C1413_r1 761 (synonym: hamy2)


4.00 


434384 


AA631910 


Hs.162849 


ESTs 


3.99 


422471 


AA311027 


Hs.271894


ESTs, Weakly similar to I38022 hypotheti 


3.99 


427386 


AW836261 


Hs.177486 


ESTs 


3.98 


433394 


AI907753 


Hs.93810 


cerebral cavernous malformations 1 


3.98 


441269 


AW015206 


Hs.178784 


ESTs 


3.97 


419629 


AB020695 


Hs.91662 


KIAA0888 protein 


3.96 


435008 


AF150262 


Hs.162898 


ESTs 


3.96 


456649 


R74441 


Hs.117176


poly(A)-binding protein, nuclear 1


3.96 


418723 


AA504428 


Hs.10487 


Homo sapiens, clone IMAGE:3954132, mRNA, 


3.96 


428738 


NM_000380


Hs.192803 


xeroderma pigmentosum, complementation g 


3.95 


430456 


AA314998 


Hs.241503 


hypothetical protein 


3.95 


422017 


NM_003877


Hs.110776


STAT induced STAT inhibitor-2 


3.95 


409960 


BE261944 


Hs.153028 


hexokinase 1 


3.95 


455309 


AW894017 




gb:RC4.NN002M5040(H)12-g04 NN0027 Homo 


3.95 


450295 


AI766732 


Hs.201194 


ESTs 


3.94 


456660 


AA909249 


Hs.112282


solute carrier family 30 (zinc transport


3.94 


410908 


AA121686 


Hs.10592 


ESTs 


3.94 


447145 


AA761073 


Hs.192943 


TRAF family member-associated NFKB activ 


3.94 


449318 


AW236021 


Hs.108788 


Homo sapiens, Similar to RIKEN cDNA 5730


3.94 


449869 


W57990 


Hs.60059 


Homo sapiens cDNA FLJ11478 fis, clone HE


3.94 


411887 


AW182924 


Hs.128790 


ESTs 


3.93 


437531 


AM00752 


Hs.112259 


T cell receptor gamma locus 


3.93 


452238 


F01811 


Hs.187931 


ESTs 


3.93


410486 


AW235094 


Hs.193424 


zinc finger protein 


3.92 


424882 


AI379461 


Hs.153636 


far upstream element (FUSE) binding prot 


3.92 


426269 


H15302


Hs.168950 


Homo sapiens mRNA; cDNA DKFZp566A1046 (f 


3.92


427043 


AA397679 


Hs.298460 


ESTs 


3.92 


440404 


AI015881 


Hs.125616 


mitochondrial ribosomal protein S5 


3.92 


452762 


AW501435 


Hs.171409 


v-akt murine thymoma viral oncogene homo 


3.92 


453058 


AW612293 


Hs.288684 


Homo sapiens cDNA FLJ11750 fis, clone HE


3.92 


423583 


AL122055 


Hs.129836 


KIAA1028 protein


3.92 


408001 


AA046458 


Hs.95296 


ESTs 


3.92 


419197 


N48921 


Hs.27441 


KIAA1615 protein 


3.91 


428695 


AJ355647 


Hs.189999 


purinergic receptor (family A group 5) 


3.91 


401747 








3.91 


410011 


AB020641 


Hs.57856 


PFTAIRE protein kinase 1 


3.91 


432205 


AI806583 


Hs.125291 


ESTs 


3.91 


447857 


AA081218 


Hs.58608 


Homo sapiens cDNA FLJ14206 fis, clone NT


3.91 


446494 


AA463276 


Hs.288906 


WW Domain-Containing Gene 


3.91 


409928 


AL137163 


Hs.57549 


hypothetical protein (U47384 


3.90


411598 


BE336654 


Hs.70937 


H3 histone family, member A 


3.90 


424790 


AL119344


Hs.13326


ESTs, Weakly similar to 2004399A chromos


3.90 
425707


AF115402 


Hs.11713 


E74-like factor 5 (ets domain transcript


3.90 


431325 


AW026751 


Hs.5794


ESTs, Weakly similar to 2109260A B cell


3.89 


451806


NM_003729 


Hs.27076 


RNA 3'-terminal phosphate cyclase


3.89 


401045 








3.89 


433023 


AW864793 


Hs.34161 


thrombospondin 1 


3.89 


452160 


BE378541 


Hs.279815 


cysteine sulfinic acid decarboxylase-rel 


3.89 


437372 


AA323968 


Hs.283631


hypothetical protein DKFZp547G183 


3.89 


417067 


AJ001417 


Hs.81086 


solute carrier family 22 (extraneuronal 


3.88 


410467 


AF102546


Hs.63931 


dachshund (Drosophila) homolog


3.88 


422660 


AW297582 


Hs^37062 


hypothetical protein FLJ22548 similar to


3.88 


431930 


AB035301 


Hs^72211 


cadherin 7, type 2


3.88 


453047 


AW023798 


Hs.286025


ESTs 


3.88


433891 


AA613792 




gb:no97h03.s1 NCI_CGAP_Pr2 Homo sapiens


3.88 


401785 








3.88 


431088 


AA491824 


Hs.196881 


ESTs 


3.88 


451952 


AL120173 


Hs^01663 


ESTs 


3.87 


422089 


AA523172 


Hs.103135 


ESTs, Weakly similar to SFR4_HUMAN SPLIC


3.87 


452277 


AL049013 


Hs.28783


KIAA1223 protein 


3.87 


438279 


AA805166 


Hs.165165 


HIV-1 rev binding protein 2 


3.86 


458229 


AI929602 


Hs.177 


phosphatidylinositol glycan, class H


3.86 


406414 








3.86 


417193 


AI922189


Hs.288390 


hypothetical protein FLJ22795


3.85


413174 


AA723564 


Hs.191343 


ESTs 


3.85 


433332 


AI367347 


Hs.127809 


Homo sapiens clone TCCCTA00151 mRNA sequ


3.85


411089 


AA456454 


Hs.118637


cell division cycle 2-like 1 (PITSLRE pr 


3.85 


412494 


AL133900 


Hs.792 


ADP-ribosylation factor domain protein 1


3.84 


413530 


AA130158 


Hs.19977 


ESTs, Moderately similar to ALU8_HUMAN A 


3.84 


459592 


AL037421 


Hs.208746 


ESTs, Moderately similar to pot ORF 1 [ 


3.84 


418329 


AW247430 


Hs.84152 


cystathionine-beta-synthase 


3.83 


451468 


AW503398 


Hs.210047 


ESTs, Moderately similar to I38022 hypot


3.83 


434804 


AA649530 




gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens


3.83 


401819 








3.82 


424179 


F30712 




Homo sapiens, clone IMAGE:4285740, mRNA


3.82 


424850 


AA151057 


Hs.153498 


chromosome 18 open reading frame 1 


3.82 


426472 


BE246138 


Hs.30853 


ESTs 


3.82 


426625 


T78300 


Hs.171409 


serologically defined colon cancer antig 


3.82 


427585 


D31152 


Hs.179729 


collagen, type X, alpha 1 (Schmid metaph 


3.82 


427756 


AI376540


Hs.15574 


ESTs 


3.82 


444701 


AI916512


Hs.198394 


ESTs 


3.82 


423052 


M28214 


Hs.123072 


RAB3B, member RAS oncogene family 


3.82 


429259 


AA420450 


Hs£92911 


ESTs, Highly similar to S60712 band-6-pr


3.82 


416111 


AA033813 


Hs.79018 


chromatin assembly factor 1, subunit A (


3.82


433586 


T85301 




gb:yd78d06.s1 Soares fetal liver spleen


3.81 


438527 


AI969251 


Hs.143237 


RAB7, member RAS oncogene family-like 1 


3.81 


410297 


AA148710 


Hs.159441 


lumican 


3.81


429898 


AW117322


Hs.42366 


ESTs 


3.81 


409079 


W87707 


Hs.82065 


interleukin 6 signal transducer (gp130,


3.80 


419423 


D26488 


Hs.90315 


KIAA0007 protein 


3.80 


429643 


AA455889 


Hs.187548 


FYVE-finger-containing Rab5 effector pro 


3.80 


431499 


NM_001514


Hs.258561 


general transcription factor IIB


3.80 


445060 


AA830811 


Hs.88808 


ESTs 


3.80 


449419 


R34910 


Hs.119172


ESTs 


3.80 


450584 


AA040403 


Hs.60371 


ESTs 


3.80 


426137 


AL040683 


Hs.167031 


DKFZP566D133 protein 


3.79 


420185 


AL044056 


Hs.158047 


ESTs 


3.79 


410076 


T05387 


Hs.7991 


ESTs 


1 "7D 
O./O 


444078 


BE246919 


Hs.10290 


U5 snRNP-specific 40 kDa protein (hPrp8-


3.78 


417318 


AW953937 


Hs.12891 


ESTs 


3.78 


414664 


AA587775 


Hs.66295 


multi-PDZ-domain-containing protein


3.78 


410275 


U85658 


Hs.61796 


transcription factor AP-2 gamma (activat 


3.77 


410503 


AW975746 


Hs.188662 


KIAA1702 protein 


3.77 


434170 


AA626509 


Hs.122329 


ESTs 


3.77 


421838 


AW881089 


Hs.108806 


Homo sapiens mRNA; cDNA DKFZp566M0947 (f 


3.77 


425268 


AI807883 


Hs.156932 


Homo sapiens cONA FU20653 fis, clone KA 


3.76 


431696 


AA259068 


Hs.267819


protein phosphatase 1, regulatory (inhib


3.76 


411990 


AW963624 


Hs.31707 


ESTs, Weakly similar to YEW4_YEAST HYPOT


3.76


430291 


AV660345 


Hs.238126 


CGM9 protein 


3.76 


448779 


BE042877 


Hs.177135 


ESTs 


3.76


452682 


AA456193 


Hs.155606 


progesterone membrane binding protein 


3.75
452598


AIM 1 504 


Hq6R547 


ESTs, Weakly similar to ALU7_HUMAN ALU S


3.75 


A3QAQR 


AAQftA7Q.1 
nnSUO/OI 


He c.0907 


f^l 1 1 R nmfotn 
vi 1 1 0 piuioui 


3.75 


AA1Y>ZQ 
HHUcOO 


AI741R33 
nil H IOOO 




ESTs


3.74 


400040 


Ai ioinn7 

ALltlUB/ 


He 9GRAAR 
nSx904U0 


IflAAflRRR nana nmrhirl 


^74 




HMIOUUUU 


He 1 

ns. 1 0/ oi» 0 


PQTe Woalttv einnHar tn IHROOQ /lafartneu 
coia, vvodiuy SHiiittaJ iu oojcoo yaiauiuoy 


3.74 


4cU050 


A 199 ASOO 
AlZc4002 


He flRun 


CO IS 


^ 74 


HO IDO/ 


M10/9J0U 




hvnnfhflHMl nrntoln PI Mn5M 


3.74 


AAT\A 1 1 
•WW 1 1 


MqnocR 


He 1RR.Q71 


hvnnfhptlral nrntoin n^P7n4.1dfi141^ 


3.74 


4ARQ17 








3.74 


** lirWU 


ARA9ARRQ 


He QfMIQ 


l/|A Anooo nmtotn 


3.74 


*KJ l£OU 


D COhO L.VO 






3.73 






He 2440 




3.73 


40U144 




He 1R7RQ4 


PRftl nrntoin- PD^IP-RQ.Iil'a nrntoin 

CnUL piOtaUl, CnUIWrOO"ilKe fjiuiuin 


q 70 


400094 




He 97fiQ3 


nontirhflnrnhrf ienmorocfi /pu/«fnnhiltn\.l 

pepuuyiproiyi louniciasy ^cyuupnuiri/*i 


^ 72 


AAfiR.07 
**4U0£/ 


A VRR71 17 
MVOO/ 1 1/ 


He 1A41RA 


PQTc MnHorntolu cimikir tn CCCCC7 otnha 


q 79 


441W00 


AIR79ftOR 


HeQftl? 
na.3U 1 1 


PQTe Wpabfu eimtlarfn C0fifiC.n nWAJimrti 
CO id, weaWy olUukli W OtDOOU UiMM*tlu10l 




HiWtf O 




He 134759 

lis. igtf J« 


ESTs 


3.72 


/MACRO 
440000 


ocri/cco. 
□co i4o»y 


He infinoo 

no. lUOOcO 


nypuuioucai pio win muv i 4 / y / 


q 79 


JHG/Y7C 


1 97/I7G 

L2/4/9 


ns./ / ooy 


rneareicn aiajaa region yeno a ico 


q 79 


400544 


A 17000 1 1 


He 1 RE. 979 

ns. looo/t 


PQTc Mnriaralotu eimilar tn AI 1 11 HI IM AM A 

co i s, Mooeraieiy similar 10 mlu t_nuwiAN a 


% 71 
0./ 1 






He 1RAA3 
no. iouoo 


hunnthotirat nmtoln PI I91R77 

uypuumuixii piutDUi n_j£ 10/ / 


^71 
0./ 1 


449897 


AWo 13042 


nS.24io5 


transmembrflne protdin vszatinj hypo&ieti 


q 71 
0./ 1 


i4OA0O7 


AIR9R979 


He Rft^O^ 


PQTc Woakh/ cimilar tn AI 1 11 HI IMAM AI 1 1 <! 

co 1 s, weawy similar 10 mlu i_nuwinfM mlu 0 


o./u 


4^o0oo 


DOR 1 CO 


He lOJfiOR 
US. 1940UO 


nomo sapiBns, cione mul».d4uo, mruMM, comp 


^7H 
o./u 


429340 


No59oo 


He 1O0A9O 
MS. 1934^9 


Homo sapiens mRNA; cDNA DKFZp434M2216 (1 


q 7ft 


437777 


A A7CQAOQ 

AA7ooQ9o 


ns.io9U/9 


CCTp 
CO IS 


q 7n 
o./u 


440351 


A CAOOOOO 


Up 74 70 

MS./ 1/9 


RAD1 (S. pombe) homolog


q 7fi 
o./u 


443603 


QCCA9Cfl1 
DC5U20U1 


Up 19/1 90. 0 


CCTp U/nqUii ptnnMip in I/IAA1ACO nmlnin 

CO IS, WBoKly SUTtlldi 10 IMAAIUOo prOlGUl 


q 7n 


44o9o5 


DCO/I9Q79 
bt24£O/0 


Ur. ice77 


WD repeat domain 15


^ 7fl 


412350 


Aio59ou0 


Ue 70Q0C 


protein tyrosine phosphatase, non-recept 


q 7n 


433852 


AIQ7COOQ 

Alo/o029 


He 19RR9Q 
rlS.l£0Oc9 


PQTc 
COl 5 


7n 
o./u 


433142 


AI 40AC07 


nS.i iuo4U 


CCTe 
CO IS 


q rq 


419994 


AA282881 


riS.190057 


colS 


9 CQ 

o.oy 


41202O 


AI9/24U2 




nypouiBUCai proicin MULui04a 


q rq 


431416 


AA532718 


KS.1 78004 


coTS 


9 CO 


439444 


A1277o52 


HS-54578 


cols, weawy similar to 100022 nypoineu 


9 CO 

0.00 


414709 


AA/047U3 


Up 77H01 


Sp2 transcription factor 


q rr 
0.00 


447397 


B £247676 


HS. 18442 


E-1 enzyme 


q rq 
0.00 


405718 






9 CQ 

0.00 


425217 


ai imewic 

AU07oo9o 


HS. 155174 


CDC5 (cell division cycle 5, S. pombe, h


9 CQ 
O.OO 


442242 


AV647908 


HS.90424 


nomo sapiens cuna. rLJ2j2o5 us, clone n 


9 CQ 
O.OO 


424090 


Bb53B356 


Hs.151777 


eukaryotic translation initiation factor


9 CQ 
O.DO 


421734 


AJoioo24 


Uf 4A7>(>li4 

nS.lU/444 


Uaiim rtAHiAnp aAIIA CI I9ACC9 fip pJ/\nA A 

nomo sapiens cuna ruj2uoo2 ns, oono i\a 


q R7 

0.0/ 


427221 


1 ic/*aq 
LI 5409 


nS.l/4UUr 


von Hippel-Lindau syndrome


q R7 


439 564 


AI 720078 


HS291997 


cois, weawy similar to A475B2 B-ceii gr 


q cr 

O.OO 


J Art J AO 

402408 








9 cc 
O.OO 


425327 


W03242 


u#» itiono 

n$.44898 


Homo sapiens done tgcctaooi5 1 mHNA soqu 


9 CC 

o.bo 


42/ lib 


AIA/DQACC9 


MS.1 145/4 


CCTe 
COlS 


1 RR 
O.OO 


427356 


AlAfAOO>IQO 

AVV02o4o2 


Uo 07D/Q 

HS.97B49 


CCTp 

CO IS 


q rr 
0.00 


452946 


X95425 


II _ Oi/WO 

rls.31092 


tpnA5 


9 cc 
O.OO 


419078 


M93119 


IP. Afltfti 

HS.89584 


insulinoma-associated 1 


9 CC 
O.DO 


416295 


AI064824 


Hs.193385 


ESTs 


O.OO 


427144 


X95097 


Hs2126 


vasoactive intestinal peptide receptor 2 


9 CC 
O.OO 


447500 


Ai38l900 


HS.159212 


ESTs 


9 CC 
O.OO 


453127 


Aio9oo71 


HS^94110 


ESTs 


9 CC 
O.DO 


423396 


AIOQ9CKC 

AkJoZoob 


Up 107QCA 

nS.12/950 


bromodomain-containing 1


q rr 

0.00 


419346 


AI830417 




polybromo 1 


3.64 


441540 


C01367 


Hs.127128 


ESTs 


3.64 


446501 


AI302616 


Hs.150819 


ESTs 


3.64 


459527 


AW977556 


Hs.291735


ESTs, Weakly similar to I78885 serine/th


3.63 


446320 


AF126245 


Hs.14791 


acyl-Coenzyme A dehydrogenase family, me 


3.63 


435706 


W31254 


Hs.7045 


GL004 protein 


3.63 


400110 








3.62 


410313 


R10305 


Hs.185683 


ESTs 


3.62 


414713 


BE465243 


Hs.12664 


ESTs 


3.62 


436279 


AW900372 


Hs.180793 


ESTs, Weakly similar to S65657 alpha-1C-


3.62 


439818 


AL360137 


Hs.19934 


Homo sapiens mRNA full length insert cDN


3.62 


451797 


AW663858 


Hs.56120


small inducible cytokine subfamily E, me


3.62 


451294 


AI457338 


Hs.29894


ESTs 


3.62



434194 


AF119847


Hs.283940


Homo sapiens PRO1550 mRNA, partial cds


3.62 


4045*00 








3.62 


408101 


AW968504 


Hs.123073


CDC2-related protein kinase 7


3.62


435846 


AA700870 


Hs.14304


ESTs


3.61


VOODOO 

402000 


N51075 


Hs.47191 


ESTs 


3.61


427276


AA400269


Hs.49598


ESTs


3.61


4004y5 


AW373784 


Hs.71


alpha-2-glycoprotein 1, zinc


3.60


403137 






3.60


404165








3.60


409571 


AA504249


Hs.187585


ESTs


3.60


410561 


occ/nocc 
Bt540Z5o 


Hs.6994 


Homo sapiens cDNA: FLJ22044 fis, clone H


3.60


«*1«<:4 


Obul04ZZ 


Lie 7COCH 


vxen rusione larruiy, memo Br r 


Q CA 


404£:o 


7AOf\A7 


U. OOOQ7D 

HS.2oo9fO 


Homo sapiens PRO2751 mRNA, complete cds


3.60




A ATQ1XQ1 
AM/Oi4a 1 


HS. i/o51o 


IninAlkAliml nM uU IJP/"»I AQTCi 

nypouieucai protein MbOi4o79 


O CA 


437162 


, AW0O55O5 


Hs.5464 


thyroid hormone receptor coactivating pr 


3.60 


437444 


LI ACfV\Q 

H4O008 


Hs.31518 


ESTs


3.60 


JIAA04A 

404210 








3.59


446157 


BE270828 


Hs.131740 


Homo sapiens cDNA: FLJ22562 fis, clone H


3.59


437587 


AI591Z22 


Hs.122421


Human DNA sequence from clone RP1-187J11


3.58 


423147 


AA987927 


Hs.131740 


Homo sapiens cDNA: FLJ22562 fis, clone H


3.57 


452226 


AA024898 


Hs.296002


ESTs 


3.56 


443775 


AF291664 


Hs.204732 


matrix metalloproteinase 26


3.56 


452501 


AB037791 


Hs.29716


hypothetical protein FLJ10980


3.56 


428647 


AA830050 


Hs.124344


ESTs 


3.56 


422443 


NM_014707 


Hs.116753


histone deacetylase 7B


3.55


447966 


AA340605 


Hs.105887


ESTs, Weakly similar to Homolog of rat Z 


3.55 


420892 


AW975076 


Hs.172589


nuclear phosphoprotein similar to S. cer


3.55 


420230 


AL034344 


Hs.298020


forkhead box C1 


3.55 


418428 


Y12490 


Hs.85092 


thyroid hormone receptor interactor 11


3.54 


428949 


AA442153 


Hs.104744


hypothetical protein DKFZp434J0617


3.54 


444929 


AI685841


Hs.161354


ESTs


3.54


433339 


AF019226


Hs.8036 


glioblastoma overexpressed 


3.54 


424369 


R87622 


Hs.26714 


KIAA1831 protein 


3.54 


433002 


AF048730 


Hs.279906


cyclin T1


3.53 


435425 


H16263 


Hs.31416 


ESTs 


3.53


415621 


AI648602 


Hs.131189 


ESTs 


3.53 


416974 


AF010233 


Hs.80667 


RALBP1 associated Eps domain containing 


3.53 


405793 








3.52


409770 


AW499536 




gb:UI-HF-BR0p-aji-c*12-<HJlj1 NIH_MGC_5 


3.52


425305 


AA3S3Q25 


Hs.155572


Human clone 23801 mRNA sequence


3.52


428939 


AW236550


Hs.131914


ESTs


3.52


438383 


AA806349 


Hs.44698 


ESTs 


3.52 


443703 


AV646177 


Hs.213021


ESTs 


3.52 


457940 


AL360159 


Hs.3o445 


Homo sapiens tripartite motif protein ps


3.52


402444 








3.52 


409643 


AW450866 


Hs.257359 


ESTs 


3.51


418250 


U29926 


Hs.83918 


adenosine monophosphate deaminase (isofo 


3.51


432745 


AJ821926 


Hs.269507 


gb:nt78f05.x5 NCI_CGAP_Pr3 Homo sapiens 


3.51


414222 


AL135173 


Hs.878 


sorbitol dehydrogenase 


3.51


430061 


AB037817 


Hs.230168 


KIAA1396 protein 


3.51 


421491 


H99999 


Hs.42736 


ESTs 


3.50 


422384 


AA224077 


Hs.42438 


Sm protein F 


3.50 


434565 


T52172 




ESTs 


3.50 


438379 


N23018 


Hs.171391 


C-terminal binding protein 2 


3.50 


439741 


BE379646 


Hs.6904 


Homo sapiens mRNA full length insert cDN


3.50


447311 


r30"7A4A 

HO/UiU 


HS.0041/ 


nomo sapiens cuna. ruz^oUb us, oone a 


3.50


447805 


AW627932 


Hs.19614 


gemin4 


3.50


454265 


H03556 


Hs.300949 


ESTs, Weakly similar to thyroid hormone


3.50


418838 


AW385224 


Hs.35198 


ectonucleotide pyrophosphatase/phosphodi


3.50 


448804 


AW512213 


Hs.42500 


ADP-ribosylation factor-like 5 


3.50 


409617 


BE003760 


Hs.55209 


Homo sapiens mRNA; cDNA DKFZp434K0514 (f


3.49 


434075 


AW003416 


Hs.160604 


ESTs 


3.49 


444190 


AI878918 


Hs.10526 


cysteine and glycine-rich protein 2


3.49 


435017 


AA338522 


Hs.12854 


angiotensin II, type 1 receptor-associat


3.48 


423445 


NM_014324


Hs.128749


alpha-methylacyl-CoA racemase


3.48 


420271 


AI954365 


Hs.42892 


ESTs 


3.48 


443684 


AI681307 


Hs.166674 


ESTs 


3.48 


444168 


AW379879 




gb:RC1-HT0256-081 1994)1 1-W1 HT0256Homo 


3.48 


446074 


AA079799 


Hs.29263


hypothetical protein FLJ11896


3.48 
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452582 


AL137407 


Hs.29911 


Homo sapiens mRNA; cDNA DKFZp434M232 (fr


3.48 


431542 


K63010 


Hs£740 


ESTs 


3.48 


432697 


AW975050 


Hs.293892


ESTs, Weakly similar to ALU4_HUMAN ALU S


O AO 


435572 


AW975339 


Hs.239828 


ESTs, Weakly similar to GA62JiUmAN RETRO 


3.47 


407192 


AA609200 




gb:al12e02.s1 Soares_testis_NHT Homo sap


3.47 


413435 


X51405 


Hs.75360 


carboxypeptidase E


3.46 


447210 


AF035269 


Hs.17752


phosphatidylserine-specific phospholipas


3.46


447958 


AW796524 


Hs.68644 


Homo sapiens microsomal signal peptidase 


3.46 


425312 


AA354940 


Hs.145958


ESTs 


3.46 


442007 


AA301116 


Hs.142838


nucleolar phosphoprotein Nopp34


O AC 




AWUU/UDO 


He 10Q/1Q 

ns. iOb^a 


P^Tc WoaVh/ cimitar tft Ml ffJlAM PHI 1 A 

co is, tveawy oimiiai vj L»A£Q_nuivif\H vasllm 


3.45 


426931 


NM_003416


Hs.2076


zinc finger protein 7 (KOX 4, clone HF.1


3.45


408739 


W01556 


Hs.238797 


ESTs, Moderately similar to 138022 hypot 


3.45 


436024 


AI800041 


Hs.190555 


ESTs 


3.45 


408418 


AW963897 


Hs.44743 


KIAA1435 protein


3.45 


409151 


AA306105 


Hs.50785 


SEC22, vesicle trafficking protein (S. c 


3.44 


418626 


AW299503 


Hs.135230 


ESTs 


3.44 


420560 


AW207748 


Hs£9115 


ESTs 


3.44 


420686 


AI950339 


Hs.40782 


ESTs 


3.44 


428870 


AA436831 


Hs.36049 


ESTs 


3.44 


436754 


AI061288 


Hs.133437 


ESTs 


3.44 


437960 


AI669586 


Hs.222194 


ESTs 


3.44 


452300 


AW628045 


Hs.28896 


Homo sapiens mRNA full length insert cDN


3.44 


421887 


AW161450 


Hs.109201 


CGI-86 protein


3.44 






TABLE 5A shows the accession numbers for those primekeys lacking a UnigeneID in Tables
5, 6, and 7. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for the sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
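
The relationship described above (one probeset, one gene cluster, several accession numbers) can be sketched as a small lookup structure. This is only an illustration, not part of the original disclosure: the names `TABLE_5A`, `split_accessions` and `index_by_accession` are hypothetical, and the accession pattern (one or two capital letters followed by five or six digits) is an assumption that happens to fit the entries shown here. The three rows are copied from Table 5A, including accession cells that run together without spaces.

```python
import re

# Assumed accession shape: 1-2 capital letters followed by 5-6 digits.
ACCESSION = re.compile(r"[A-Z]{1,2}\d{5,6}")

# Three rows from Table 5A: Pkey -> (CAT number, raw Accession cell).
TABLE_5A = {
    419346: ("184129.1", "AI830417AA236612"),
    434565: ("38898.1", "T52172AF147324 T52248"),
    444168: ("593829.1", "AW379879AI126285H12014"),
}


def split_accessions(cell: str) -> list[str]:
    """Split an Accession cell into accession numbers, even when run together."""
    return ACCESSION.findall(cell)


def index_by_accession(table: dict) -> dict[str, int]:
    """Build a reverse index from each accession number to its Pkey."""
    index = {}
    for pkey, (_cat_number, cell) in table.items():
        for accession in split_accessions(cell):
            index[accession] = pkey
    return index


ACCESSION_TO_PKEY = index_by_accession(TABLE_5A)
```

With such an index, any accession in a cluster can be traced back to the probeset designed from that cluster, for example `ACCESSION_TO_PKEY["AF147324"]`.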



Pkey	CAT number	Accession
407596	1003489.1	R86913 R86901 H25352 R01370 H43764 AW044451 W21298
408432	1058667.1	AW195262 R27868 AW811262
409752	11530.1	AW963990 AA078196 AW749482 AA077468 BE151571 AA376917
409770	1154048.1	AW499536 AW499553 AW502138 AW499537 AW502136 AW501743
411440	124577.1	AW749402 AW749403 Z45743 R80376 AA093358
411479	1247077.1	AW848047 AW848202 AW848631 AW848142 AW848702 AW848121 AW848632 AW848140 AW848571
		AW848009 AW848067 AW848069 AW848905 AW848214
411624	1252166.1	BE145964 BE146286 AW854564
412991	134248.1	AW949013 AA126111
414269	143133.1	AA298489 AA137165
415123	1523390.1	D60925 D60828 D80787
415715	1548818.1	F30364 F36559 T15435
416268	1585983.1	H51299 H44619 H46391 R85024 H51892 T72744
416289	1586037.1	W26333 R05358 H44682
417730	1695795.1	Z44761 R25801 R11926 R35604
418636	177402.1	AW749855 AA225995 AW750208 AW750206
419346	184129.1	AI830417 AA236612
419536	185688.1	AA603305 AA244095 AA244183
420111	190755.1	AA255652 AA280911 AW967920 AA262684
422219	213547.1	AW978073 AW978072 AA807550 AA306567
424179	236389.1	F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502
424242	237181.1	AA337476 AW966227 AA450376 AW960222 AA381051
428002	285602.1	AA418703 AA418711 BE071915 BE071920 BE071912
429163	300543.1	AA884766 AW974271 AA592975 AA447312
432189	342819.1	AA527941 AI810608 AI620190 AA635266
432340	345248.1	AA534222 AA632632 T81234
432363	345469.1	AA534489 AW970240 AW970323
432966	356839.1	AA550114 AW974148 AA572946
433586	370470.1	T85301 AW517087 AA601054 BE073959
433641	37186.1	AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547
		AI680833 AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376
		AI583718 AI672574 N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968
		AA204735 AA207155 AA206262 AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418
		AW818140 AA502500 AI206199 AI671282 AI352545 BE501030 AI652535 BE465762 AA206331 AW451866
		AA471088 AA206342 AA204834 AA206100 AW021661 AA332922 N66048 AA703396 H92278 AW139734
		H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE466611 AI206344 AA574397 AA348354
		AI493192
433687	373061.1	AA743991 AA604852 AW272737
433891	376239.1	AA613792 AW182329 T05304 AW858385
434415	385931.1	BE177494 AW276909 AA632849
434565	38898.1	T52172 AF147324 T52248
434804	393481.1	AA649530 AA659316 H64973
437113	433234.1	AA744693 AW750059
444168	593829.1	AW379879 AI126285 H12014
448212	755099.1	AI475858 AW969013
448310	757918.1	AI480316 AW847535
451746	883303.1	M86178 AI813822 D56993
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BE077084 AW139953 AW863127 AWB05209 AW806204 AW806205 AW806206 AW80621 1 AW806212 
AW806207 AW806208 AW806210 AJ907497

AW838616 AW838660 BE144343 AI914520 AW888910 BE184854 BE184784
AL133761 AL133767

BE176479 BE176678 BE176357 BE176550 AW886079 BE176676 BE176615 BE176555 BE176489 BE176610
BE176362

AW894017 AW893956 AW894032 
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TABLE 5B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Tables 5, 6, and 7. For each predicted exon, we have listed the 
genomic sequence source used for prediction. Nucleotide locations of each predicted exon 
are also listed. 



Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7-digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22", Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
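
As an illustration of how the nucleotide-position entries can be read, the following minimal sketch parses one such entry into exon intervals. It is not part of the original disclosure: the function names are hypothetical, and treating each "start-end" pair as inclusive coordinates on the referenced genomic sequence is an assumption, since the text does not state the convention. The example string is the entry listed for Pkey 401045 (Ref 8117619, Plus strand).

```python
def parse_nt_position(cell: str) -> list[tuple[int, int]]:
    """Parse a comma-separated list of 'start-end' ranges into integer pairs."""
    exons = []
    for part in cell.split(","):
        start, end = part.strip().split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons: list[tuple[int, int]]) -> int:
    """Sum exon lengths, assuming both endpoints are included in each exon."""
    return sum(end - start + 1 for start, end in exons)


# Entry listed for Pkey 401045 in Table 5B.
exons_401045 = parse_nt_position("90044-90184,91111-91345")
length_401045 = total_exon_length(exons_401045)
```

Under the inclusive-coordinate assumption, the two predicted exons for this probeset span 141 and 235 nucleotides respectively.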



Pkey	Ref	Strand	Nt_position
401045	8117619	Plus	90044-90184,91111-91345
401424	8176894	Plus	24223-24428
401451	6634068	Minus	119926-121272
401714	6715702	Plus	96484-96681
401747	9789672	Minus	118595-118[illegible],119119-119244,119609-119761,12[illegible]-131258,131866-131932,132451-132575,133580-134011
401785	7249190	Minus	165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168942
401819	7467933	Minus	28217-28486
402408	9796239	Minus	110326-110491
402444	9796614	Plus	28391-28517
402791	6137008	Minus	51036-51207
403047	3540153	Minus	59793-59968
403137	9211494	Minus	92349-92572,92958-93084,93579-93712,93949-94072,94591-94748,95214-95337
403721	7528046	Minus	156647-157366
403764	7717105	Minus	118692-118853
403797	8099896	Minus	123065-125008
404165	9926489	Minus	69025-69128
404210	5006246	Plus	169926-170121
404253	9367202	Minus	55675-56055
404561	9795980	Minus	69039-70100
404571	7249169	Minus	112450-112648
404721	9856648	Minus	173763-174294
404915	7341766	Minus	100915-101087
404939	6862697	Plus	175318-175476
405403	6850244	Minus	37491-37670,40951-41031
405685	4508129	Minus	37956-38097
405718	9795467	Plus	113080-113266
405793	1405887	Minus	89197-89453
405876	6758747	Plus	39694-40031
405917	7712162	Minus	106829-107213
406414	9256407	Plus	49593-49850
406554	7711566	Plus	106956-107121
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TABLE 6:286 GENES ENCODING EXTRACELLULAR OR CELL SURFACE 
PROTEINS UP-REGULATED IN PROSTATE CANCER COMPARED TO 
NORMAL ADULT TISSUES 

Table 6 shows 286 genes up-regulated in prostate cancer compared to normal adult tissues
that are likely to encode extracellular or cell-surface proteins. These genes were selected as
for Table 5, with the additional requirement that the predicted protein contain a structural
domain indicative of extracellular localization (e.g., EGF or 7TM domains).



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal tissue
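
To show how the R1 column might be used when working with this table, here is a minimal sketch that keeps rows at or above a chosen tumor-to-normal ratio and ranks them. It is an illustration only: the `Row` type, the `up_regulated` function and the cut-off of 15.0 are hypothetical, and the sketch does not reproduce how the ratios themselves were derived. The four rows are taken from Table 6.

```python
from typing import NamedTuple


class Row(NamedTuple):
    pkey: int
    ex_accn: str
    unigene_id: str
    title: str
    r1: float  # ratio of tumor to normal tissue, as listed in the table


# Four rows copied from Table 6 (titles are truncated as in the table).
ROWS = [
    Row(420757, "X78592", "Hs.99915", "androgen receptor (dihydrotestosterone r", 19.72),
    Row(400298, "AA032279", "Hs.61635", "six transmembrane epithelial antigen of", 43.48),
    Row(420154, "AI093155", "Hs.95420", "JM27 protein", 41.12),
    Row(440286, "U29589", "Hs.7138", "cholinergic receptor, muscarinic 3", 12.04),
]


def up_regulated(rows: list[Row], min_ratio: float) -> list[Row]:
    """Return rows with R1 >= min_ratio, highest ratio first."""
    kept = [row for row in rows if row.r1 >= min_ratio]
    return sorted(kept, key=lambda row: row.r1, reverse=True)


# Hypothetical cut-off chosen only for this example.
top = up_regulated(ROWS, min_ratio=15.0)
```

Sorting by descending R1 mirrors the order in which the table itself appears to list its entries.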





Pkey	ExAccn	UnigeneID	Unigene Title	R1
409361	NM_005982	Hs.54416	sine oculis homeobox (Drosophila) homolo	48.28
409731	AA125985	Hs.56145	thymosin, beta, identified in neuroblast	45.24
400298	AA032279	Hs.61635	six transmembrane epithelial antigen of	43.48
420154	AI093155	Hs.95420	JM27 protein	41.12
426747	AA535210	Hs.171995	kallikrein 3, (prostate specific antigen	31.80
400299	X07730	Hs.171995	kallikrein 3, (prostate specific antigen	24.51
425075	AA506324	Hs.1852	acid phosphatase, prostate	24.23
424846	AU077324	Hs.1832	neuropeptide Y	23.57
405685				20.90
420757	X78592	Hs.99915	androgen receptor (dihydrotestosterone r	19.72
418994	AA296520	Hs.89546	selectin E (endothelial adhesion molecul	19.56
452792	AB037765	Hs.30652	KIAA1344 protein	17.39
445472	AB006631	Hs.12784	Homo sapiens mRNA for KIAA0293 gene, par	17.00
414565	AA502972	Hs.183390	hypothetical protein FLJ13590	16.32
431716	D89053	Hs.268012	fatty-acid-Coenzyme A ligase, long-chain	16.60
408430	S79876	Hs.44926	dipeptidylpeptidase IV (CD26, adenosine	16.28
408000	L11690	Hs.620	bullous pemphigoid antigen 1 (230/240kD)	15.54
430226	BE245562	Hs.2551	adrenergic beta-2-, receptor, surface	15.40
444484	AK002126	Hs.11260	hypothetical protein FLJ11264	14.76
418601	AA279490	Hs.86368	calmegin	14.56
448999	AF179274	Hs.22791	transmembrane protein with EGF-like and	14.55
416182	NM_004354	Hs.79069	cyclin G2	12.94
420544	AA677577	Hs.98732	Homo sapiens Chromosome 16 BAC clone CIT	12.79
445413	AA151342	Hs.12677	CGI-147 protein	12.64
453930	AA419466	Hs.36727	hypothetical protein FLJ10903	12.22
440286	U29589	Hs.7138	cholinergic receptor, muscarinic 3	12.04
452784	BE463857	Hs.151258	hypothetical protein FLJ21062	11.86
450203	AF097994	Hs.301528	l-kynurenine/alpha-aminoadipate aminotra	11.68
448045	AJ297436	Hs.20166	prostate stem cell antigen	11.51
449650	AF055575	Hs.23838	calcium channel, voltage-dependent, L ty	11.18
420381	D50640	Hs.337616	phosphodiesterase 3B, cGMP-inhibited	11.10
425665	AK001050	Hs.159066	hypothetical protein FLJ10188	11.08
425710	AF030880	Hs.159275	solute carrier family, member 4	11.08
428728	NM_016625	Hs.191381	hypothetical protein	11.04
407021	U52077		gb:Human mariner1 transposase gene, comp	11.02
410733	D84284	Hs.66052	CD38 antigen (p45)	11.02
452340	NM_002202	Hs.505	ISL1 transcription factor, LIM/homeodoma	10.85
428819	AL135623	Hs.193914	KIAA0575 gene product	10.48
421991	NM_014918	Hs.110488	KIAA0990 protein	10.04
431217	NM_013427	Hs.250830	Rho GTPase activating protein 6	9.75
421470	R27496	Hs.1378	annexin A3	9.64
409262	AK000631	Hs.52256	hypothetical protein FLJ20624	9.45
435980	AF274571	Hs.129142	deoxyribonuclease II beta	9.24
421246	AW582962	Hs.102897	CGI-47 protein	9.20
410001	AB041036	Hs.57771	kallikrein 11	9.03


441791	AW372449	Hs.175982	hypothetical protein FLJ21159	9.02
404571				8.66
456497	AW967956	Hs.123648	ESTs, Weakly similar to AF108460 1 ubinu	8.56
419968	X04430	Hs.93913	interleukin 6 (interferon, beta 2)	8.36
433172	AB037841	Hs.102652	hypothetical protein ASH1	8.30
422631	BE218919	Hs.118793	hypothetical protein FLJ10688	8.27
427674	NM_003528	Hs.2178	H2B histone family, member Q	8.20
404915				8.08
452259	AA317439	Hs.28707	signal sequence receptor, gamma (translo	8.06
452891	N75582	Hs.212875	ESTs, Weakly similar to DYH9_HUMAN CILIA	8.02
439731	AI953135	Hs.45140	hypothetical protein FLJ14084	7.98
419839	U24577	Hs.93304	phospholipase A2, group VII (platelet-ac	7.68
420120	AL049610	Hs.95243	transcription elongation factor A (SII)-	7.64
424099	AF071202	Hs.139336	ATP-binding cassette, sub-family C (CFTR	7.64
448706	AW291095	Hs.21814	interleukin 20 receptor, alpha	7.52
410227	AB009284	Hs.61152	exostoses (multiple)-like 2	7.49
425211	M18667	Hs.1867	progastricsin (pepsinogen C)	7.35
441736	AW292779	Hs.169799	ESTs	7.28
419991	AJ000098	Hs.94210	eyes absent (Drosophila) homolog 1	7.20
425016	BE245277	Hs.154196	E4F transcription factor 1	7.20
424560	AA158727	Hs.150555	protein predicted by clone 23733	7.18
409110	AA191493	Hs.48778	niban protein	7.10
421566	NM_000399	Hs.1395	early growth response 2 (Krox-20 (Drosop	7.04
431725	X65724	Hs.2839	Norrie disease (pseudoglioma)	6.98
425782	U66468	Hs.159525	cell growth regulatory with EF-hand doma	6.85
427408	AA583206	Hs.2156	RAR-related orphan receptor A	6.79
435604	AA625279	Hs.26892	uncharacterized bone marrow protein BM04	6.73
415874	AF091622	Hs.78893	KIAA0244 protein	6.54
401451				6.52
431778	AL080276	Hs.268562	regulator of G-protein signalling 17	6.51
409089	NM_014781	Hs.50421	KIAA0203 gene product	6.50
431992	NM_002742	Hs.2891	protein kinase C, mu	6.49
404253				6.42
421552	AF026692	Hs.105700	secreted frizzled-related protein 4	6.41
416806	NM_000288	Hs.79993	peroxisomal biogenesis factor 7	6.38
431958	X63629	Hs.2877	cadherin 3, type 1, P-cadherin (placenta	6.30
439366	AF100143	Hs.6540	fibroblast growth factor 13	6.30
416836	D54745	Hs.80247	cholecystokinin	6.30
433383	AF034837	Hs.192731	double-stranded RNA specific adenosine d	6.29
450728	AW162923	Hs.25363	presenilin 2 (Alzheimer disease 4)	6.25
413384	NM_000401	Hs.75334	exostoses (multiple) 2	6.22
423349	AF010258	Hs.127428	homeo box A9	6.20
424800	AL035588	Hs.153203	MyoD family inhibitor	6.18
425451	AF242769	Hs.157461	mesenchymal stem cell protein DSC54	6.14
447359	NM_012093	Hs.18268	adenylate kinase 5	6.00
410889	X91662	Hs.66744	twist (Drosophila) homolog (acrocephalos	5.97
408829	NM_006042	Hs.48384	heparan sulfate (glucosamine) 3-O-sulfot	5.94
453911	AW503857	Hs.4007	Sarcolemmal-associated protein	5.94
408875	NM_015434	Hs.48604	DKFZP434B168 protein	5.92
450480	X82125	Hs.25040	zinc finger protein 239	5.90
451684	AF216751	Hs.26813	CDA14	5.88
400301	X03635	Hs.1657	estrogen receptor 1	5.78
415077	L41607	Hs.934	glucosaminyl (N-acetyl) transferase 2, I	5.74
418852	BE537037	Hs.273294	hypothetical protein FLJ20069	5.72
446867	AB007891	Hs.16349	KIAA0431 protein	5.72
410232	AW372451	Hs.61184	CGI-79 protein	5.70
422762	AL031320	Hs.119976	Human DNA sequence from clone RP1-20N2 o	5.70
450616	AL133067	Hs.302689	hypothetical protein	5.70
408621	AI970672	Hs.46638	chromosome 11 open reading frame 8	5.65
439671	AW162840	Hs.6641	kinesin family member 5C	5.64
410196	AI936442	Hs.59838	hypothetical protein FLJ10808	5.60
429170	NM_001394	Hs.2359	dual specificity phosphatase 4	5.60
440738	AI004650	Hs.225674	WD repeat domain 9	5.60
414342	AA742181	Hs.75912	KIAA0257 protein	5.59
422634	NM_016010	Hs.118821	CGI-62 protein	5.56
400268				5.55
439569	AW60ei66	Hs.222399	CEGP1 protein	5.51
452823	AB012124	Hs.30696	transcription factor-like 5 (basic helix	5.48
431938	AA938471	Hs.54431	specific granule protein (28 kDa); cyste	5.44
427638	AA406411	Hs.208341	ESTs, Weakly similar to KIAA0989 protein	5.42
421264	AL039123	Hs.103042	microtubule-associated protein 1B	5.38


421685	AF189723	Hs.106778	ATPase, Ca++ transporting, type 2C, memb	5.37
421987	AI133161	Hs.285131	CGI-101 protein	5.36
422806	BE314767	Hs.1581	glutathione S-transferase theta 2	5.34
432281	AK001239	Hs.274263	hypothetical protein FLJ10377	5.32
451982	F13036	Hs.27373	Homo sapiens mRNA; cDNA DKFZp56401763 (f	5.32
444042	NM_004915	Hs.10237	ATP-binding cassette, sub-family G (WHIT	5.31
447752	M73700	Hs.105938	lactotransferrin	5.29
451418	BE387790	Hs.26369	hypothetical protein FLJ20287	5.22
428593	AW207440	Hs.185973	degenerative spermatocyte (homolog Droso	5.21
447541	AK000288	Hs.18800	hypothetical protein FLJ20281	5.18
459294	AW977286	Hs.17428	RBP1-like protein	5.16
424692	AM29834	Hs.151791	KIAA0092 gene product	5.15
416434	AW163045	Hs.79334	nuclear factor, interleukin 3 regulated	5.11
410268	AA316181	Hs.61635	six transmembrane epithelial antigen of	5.10
417517	AF001176	Hs.82238	POP4 (processing of precursor, S. cerev	5.10
453616	NM_003462	Hs.33846	dynein, axonemal, light intermediate pol	5.10
427958	AA418000	Hs.98280	potassium intermediate/small conductance	5.09
407945	X69208	Hs.606	ATPase, Cu++ transporting, alpha polypep	5.08
418576	AW968159	Hs.289104	Alu-binding protein with zinc finger dom	5.05
413328	Y15723	Hs.75295	guanylate cyclase 1, soluble, alpha 3	5.04
432729	AK000292	Hs.278732	hypothetical protein FLJ20285	5.04
426342	AF093419	Hs.169378	multiple PDZ domain protein	5.02
429782	NM_005754	Hs.220689	Ras-GTPase-activating protein SH3-domain	5.02
436209	AW850417	Hs.254020	ESTs, Moderately similar to unnamed prot	5.02
430599	NM_004855	Hs.247118	phosphatidylinositol glycan, class B	5.00
451386	AB029006	Hs.26334	spastic paraplegia 4 (autosomal dominant	5.00
457211	AW972565	Hs.32399	ESTs, Weakly similar to S51797 vasodilat	4.97
425851	NM_001490	Hs.159642	glucosaminyl (N-acetyl) transferase 1, c	4.97
421689	N87820	Hs.106826	KIAA1696 protein	4.93
416533	BE244053	Hs.79362	retinoblastoma-like 2 (p130)	4.92
432653	N62096	Hs.293185	ESTs, Weakly similar to JC7328 amino aci	4.91
403047				4.91
431117	AF003522	Hs.250500	delta (Drosophila)-like 1	4.90
427617	D42063	Hs.199179	RAN binding protein 2	4.88
428804	AK000713	Hs.193736	hypothetical protein FLJ20706	4.89
449071	NM_005872	Hs.22960	breast carcinoma amplified sequence 2	4.86
407596	R86913		gb:yq30f05.r1 Soares fetal liver spleen	4.84
456516	BE172704	Hs.222746	KIAA1610 protein	4.84
458339	AW976853	Hs.172843	ESTs	4.83
422083	NM_001141	Hs.111256	arachidonate 15-lipoxygenase, second typ	4.82
449535	W15267	Hs.23672	low density lipoprotein receptor-related	4.82
422048	NM_012445	Hs.288126	spondin 2, extracellular matrix protein	4.82
424602	AK002055	Hs.151046	hypothetical protein FLJ11193	4.78
410765	AI694972	Hs.66180	nucleosome assembly protein 1-like 2	4.77
419879	Z17805	Hs.93564	Homer, neuronal immediate early gene, 2	4.74
450649	NM_001429	Hs.25272	E1A binding protein p300	4.74
411624	BE145964	Hs.103283	KIAA0594 protein	4.72
404721				4.70
426261	AW242243	Hs.168670	peroxisomal farnesylated protein	4.70
416276	U41060	Hs.79136	LIV-1 protein, estrogen regulated	4.64
408374	AW025430	Hs.155591	forkhead box F1	4.64
451900	AB023199	Hs.27207	KIAA0982 protein	4.63
421437	AW821252	Hs.104336	hypothetical protein	4.63
434629	AA789081	Hs.4029	glioma-amplified sequence-41	4.60
403764				4.58
421247	BE391727	Hs.102910	general transcription factor IIH, polype	4.53
403721				4.50
453070	AK001465	Hs.31575	SEC63, endoplasmic reticulum translocon	4.49
417412	X16896	Hs.82112	interleukin 1 receptor, type 1	4.48
439735	AI635386	Hs.142846	hypothetical protein	4.48
430261	AA305127	Hs.237225	hypothetical protein HT023	4.46
430598	AK001764	Hs.247112	hypothetical protein FLJ10902	4.44
400303	AA242758	Hs.79136	LIV-1 protein, estrogen regulated	4.42
438209	AL120659	Hs.6111	aryl-hydrocarbon receptor nuclear transl	4.42
417421	AL138201	Hs.82120	nuclear receptor subfamily 4, group A, m	4.40
447270	AC002551	Hs.331	general transcription factor IIIC, polyp	4.38
434423	NM_006769	Hs.3844	LIM domain only 4	4.35


404561				
422969	AA782538	Hs.122647	N-myristoyltransferase 2	4.32
423685	BE350494	Hs.49753	uveal autoantigen with coiled coil domai	4.32
425071	NM_013989	Hs.154424	deiodinase, iodothyronine, type II	4.32
431583	AL042613	Hs.262476	S-adenosylmethionine decarboxylase 1	4.31
442818	AK001741	Hs.8739	hypothetical protein FLJ10879	4.30
423740	Y07701	Hs.293007	aminopeptidase puromycin sensitive	4.24
424701	NM_005923	Hs.151988	mitogen-activated protein kinase kinase	4.21
424085	NM_002914	Hs.139226	replication factor C (activator 1) 2 (40	4.20
410294	AB014515	Hs.323712	KIAA0615 gene product	4.18
447124	AW976438	Hs.17428	RBP1-like protein	4.18
438018	AK001160	Hs.5999	hypothetical protein FLJ10298	4.16
443857	AI089292	Hs.287621	hypothetical protein FLJ14069	4.15
446711	AF169692	Hs.12450	protocadherin 9	4.15
405403				4.14
448148	NM_016578	Hs.20509	HBV pX associated protein-8	4.13
417531	NM_003157	Hs.1087	serine/threonine kinase 2	4.12
433345	AI681545	Hs.152982	hypothetical protein FLJ13117	4.10
432712	AB016247	Hs.288031	sterol-C5-desaturase (fungal ERG3, delta	4.09
435114	AA775483	Hs.288936	mitochondrial ribosomal protein L9	4.08
445459	AI478629	Hs.158465	likely ortholog of mouse putative IKK re	4.08
402791				4.04
438660	U95740	Hs.6349	Homo sapiens, clone IMAGE:3010666, mRNA,	4.04
447568	AF155655	Hs.18885	CGI-116 protein	4.04
452211	AI985513	Hs.233420	ESTs	4.02
443292	AK000213	Hs.9196	hypothetical protein	4.01
420911	U77413	Hs.100293	O-linked N-acetylglucosamine (GlcNAc) tr	4.00
428738	NM_000380	Hs.192803	xeroderma pigmentosum, complementation g	3.95
430456	AA314998	Hs.241503	hypothetical protein	3.95
437531	AI400752	Hs.112259	T cell receptor gamma locus	3.93
428695	AI355647	Hs.189999	purinergic receptor (family A group 5)	3.91
410011	AB020641	Hs.57856	PFTAIRE protein kinase 1	3.91
446494	AA463276	Hs.288906	WW Domain-Containing Gene	3.91
409928	AL137163	Hs.57549	hypothetical protein dJ473B4	3.90
411598	BE336654	Hs.70937	H3 histone family, member A	3.90
425707	AF115402	Hs.11713	E74-like factor 5 (ets domain transcript	3.90
451806	NM_003729	Hs.27076	RNA 3'-terminal phosphate cyclase	3.89
401045				3.89
437372	AA323S68	Hs.283631	hypothetical protein DKFZp547G183	3.89
417067	AJ001417	Hs.81086	solute carrier family 22 (extraneuronal	3.88
410467	AF102546	Hs.63931	dachshund (Drosophila) homolog	3.88
431930	AB035301	Hs.272211	cadherin 7, type 2	3.88
453047	AW023798	Hs.286025	ESTs	3.88
401785				3.88
458229	AI929602	Hs.177	phosphatidylinositol glycan, class H	3.86
406414				3.86
412494	AL133900	Hs.792	ADP-ribosylation factor domain protein 1	3.84
418329	AW247430	Hs.84152	cystathionine-beta-synthase	3.83
424850	AA151057	Hs.153498	chromosome 18 open reading frame 1	3.82
427585	D31152	Hs.179729	collagen, type X, alpha 1 (Schmid metaph	3.82
423052	M28214	Hs.123072	RAB3B, member RAS oncogene family	3.82
416111	AA033813	Hs.79018	chromatin assembly factor 1, subunit A (	3.82
419423	D26488	Hs.90315	KIAA0007 protein	3.80
429643	AA455889	Hs.167279	FYVE-finger-containing Rab5 effector pro	3.80
431499	NM_001514	Hs.258561	general transcription factor IIB	3.80
444078	BE246919	Hs.10290	U5 snRNP-specific 40 kDa protein (hPrp8-	3.78
430291	AV660345	Hs.238126	CGI-49 protein	3.76
431637	AI879330	Hs.265960	hypothetical protein FLJ10563	3.74
440411	N30256	Hs.151093	hypothetical protein DKFZp434G1415	3.74
405917				3.74
451230	BE546208	Hs.26090	hypothetical protein FLJ20272	3.73
429597	NM_003816	Hs.2442	a disintegrin and metalloproteinase doma	3.73
415075	L27479	Hs.77889	Friedreich ataxia region gene X123	3.72
440351	AF030933	Hs.7179	RAD1 (S. pombe) homolog	3.70
443603	BE502601	Hs.134289	ESTs, Weakly similar to KIAA1063 protein	3.70
446965	BE242873	Hs.16677	WD repeat domain 15	3.70
412350	AI659306	Hs.73826	protein tyrosine phosphatase, non-recept	3.70


433852 


A1378329 


Hs.126629 


ESTs 


3.70 


447397 


BE247676 


Hs.18442 


E-1 enzyme 


3.68 


405718 








3.68 
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425217 


AU076696 


Hs.1 55174 


CDC5 (cell division cycle 5, S. pombe, h 


O CQ 

o.bo 


421734 


1|AJ AAA J 

AI318624 


Hs.1 07444 


Homo sapiens cDNA PU20552 ris, done ka 


o en 


427221 


115409 


HS.1 74007 


von Hippel-Lindau syndrome 


O C7 

3.0/ 


402408 








3.66 


452946 


X95425 


II _ A4AAt 

Hs.31092 


cpnA5 


Q CA 


419078 


M93119 


Hs.89584 


insuDnoma-assotiatad 1 


o cc 
o.bo 


427144 


X95097 


1 I* /\ AAA 

Hs.2126 


vasoactive intestinal peptide receptor 2 


3.65 


423396 


AI382555 


Hs.127950 


bromodomain-containing 1 


o.bo 


446320 


AF1 26245 


Hs.14791 


acyl-Coenzyme A dehydrogenase family, me 


O CO 

3.63 


404939 








O CO 


403137 








o cn 
o.bU 


437162 


AW005505 


Hs.5464 


thyroid hormone receptor coactivating pr 


3.60 


404210 








3.59 


443775 


AF291664 


Hs.204732 


matrix metailoproteinase 26 


, 3.56 


452501 


AB037791 


U. AA74 C 

HS.29716 


hypothetical protein rU 10980 


o cc 


422443 


NM_014707 


HSJ 16753 


histone deacetytase 7B 


3.55 


420230 


AL034344 


H&2o418o 


lorknead box CI 


o cc 

3.55 


418428 


Y12490 


II _ grAAA 

H5.85092 


thyroid hormone receptor interactor 1 1 


3.54 


433002 


AF048730 


HS279906 


cycfinTI 


3.53 


405793 










457940 


AL360159 


Hs.306517 


Homo sapiens TRIpartite motif protein ps 


3.52 


402444 








3.52 


418250 


U29926 


Hs.83918 


adenosine monophosphate deaminase (tsofo 


3.51 


414222 


AL135173 


Hs.878 


sorbitol dehydrogenase 


351 


422384 


AA224077 


Hs.42438 


Sm protein F 


3.50 


447805 


AW627932 


Hs.19614 


gemin4 


3.50 


454265 


H03556 


Hs.300949 


ESTs, Weakly similar to thyroid hormone 


3.50 


423445 


NMJH4324 


Hs.128749 


alpha-methylacyi-CoA racemase 


3.48 


413435 


X51405 


Hs.75360 


carboxypepUdase E 


3.46 


447210 


AF035269 


Hs.17752 


pnosphatidytserine-specflic phospholipas 


3.46 


426931 


NM 003416 


Hs.2076 


zinc finger protein 7 (KOX 4, done HF.1 


3.45 


408418 


AW963897 


Hs.44743 


KIAA1435 protein 


3.45 


421887 


AW161450 


Hs.109201 


CGI-86 protein 


3.44 
TABLE 7: 42 GENES ENCODING SMALL MOLECULE TARGETS UP-REGULATED IN
PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES

Table 7 shows 42 genes up-regulated in prostate cancer compared to normal adult tissues that
are likely to be small molecule targets. These were selected as for Table 5, with the further
requirement that the predicted protein contain a structural domain indicative of a druggable
structure (e.g. protease, kinase, phosphatase, receptor). The functional domain is indicated
for each gene.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

PSDomain: Protein Structural Domain

R1: Ratio of tumor vs. normal tissue
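
As an illustration only, the selection criterion described for Table 7 (keeping genes whose predicted protein carries a domain typical of a druggable class) can be sketched as a filter over the PSDomain column. The keyword-to-class mapping and the function name `druggable_classes` are assumptions made for this sketch; the source does not state which domain names were treated as indicative, and the domain strings are written here with underscores.

```python
# Hypothetical mapping from domain-name fragments to druggable classes.
# The fragments echo PSDomain values listed in Table 7; the grouping itself
# is an assumption for illustration, not the method used in the source.
DRUGGABLE_KEYWORDS = {
    "pkinase": "kinase",
    "phosphatase": "phosphatase",
    "trypsin": "protease",
    "peptidase": "protease",
    "7tm": "receptor",
    "hormone_rec": "receptor",
}


def druggable_classes(ps_domain):
    """Return the sorted druggable classes suggested by a PSDomain string."""
    classes = set()
    for domain in ps_domain.split(","):
        name = domain.strip().lower()
        for keyword, label in DRUGGABLE_KEYWORDS.items():
            if keyword in name:
                classes.add(label)
    return sorted(classes)


# Example PSDomain strings in the comma-separated form used by Table 7.
examples = {
    "protein kinase C, mu": "pkinase,DAG_PE-bind,PH",
    "androgen receptor": "Androgen_recep,hormone_rec,zf-C4",
    "E-1 enzyme": "Hydrolase",
}
for title, ps_domain in examples.items():
    print(title, druggable_classes(ps_domain))
```

A gene with an empty result would not pass this hypothetical filter; a real selection would presumably use a curated domain list rather than substring matching.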

Pkey ExAccn UnigeneID Unigene Title PSDomain R1

428747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 31.30
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 24.91
420757 X78592 Hs.99915 androgen receptor (dihydrotestosterone r Androgen_recep,hormone_rec,zf-C4 19.72
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine DPPIV_N_term,Peptidase_S9 16.28
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 7tm_1 15.40
411096 U80034 Hs.68583 mitochondrial intermediate peptidase Peptidase^ 14.31
440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 7tm_1 12.04
420381 D50640 Hs.337616 phosphodiesterase 3B, cGMP-inhibited PDEase 11.10
407021 U52077 gb:Human mariner1 transposase gene, comp SET,Transposase_1 11.02
401424 arginase 9.58
410001 AB041036 Hs£7771 kallikrein 11 trypsin 9.03
428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, Peptidase_M10 8.76
424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR ABC_tran,ABC_membrane 7.64
419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 Hydrolase 7.20
431992 NM_002742 Hs.2891 protein kinase C, mu pkinase,DAG_PE-bind,PH 6.49
447359 NM_012093 Hs.18268 adenylate kinase 5 adenylatekinase 6.00
400301 X03635 Hs.1657 estrogen receptor 1 Oest_recep,zf-C4,hormone_rec 5.78
421685 AF189723 Hs.106778 ATPase, Ca++ transporting, type 2C, memb E1-E2_ATPase,Hydrolase 5.37
444042 NM_004915 Hs.10237 ATP-binding cassette, sub-family G (WHIT ABC_tran 5.31
447752 M73700 Hs.105938 lactotransferrin transferrin,7tm_1 5.29
407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha polypep E1-E2_ATPase,Hydrolase,HMA 5.08
403047 trypsin 4.91
427617 D42063 Hs.199179 RAN binding protein 2 Ran_BP1,zf-RanBP,TPR,pro_isomerase 4.88
422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second typ lipoxygenase,PLAT 4.82
449535 W15267 Hs.23672 low density lipoprotein receptor-related ldl_recept_b,ldl_recept_a,EGF 4.82
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II T4_deiodinase 4.32
423740 Y07701 Hs.293007 aminopeptidase puromycin sensitive Peptidase_M1 4.24
424701 NM_005923 Hs.151988 mitogen-activated protein kinase kinase pkinase 4.21
424085 NM_002914 Hs.139226 replication factor C (activator 1) 2 (40 AAA,Viral_helicase1 4.20
417531 NM_003157 Hs.1087 serine/threonine kinase 2 pkinase 4.12
428695 AI355647 Hs.189999 purinergic receptor (family A group 5) 7tm_1 3.91
410011 AB020641 Hs£7856 PFTAIRE protein kinase 1 pkinase 3.91
424850 AA151057 Hs.153498 chromosome 18 open reading frame 1 ldl_recept_a 3.82
412350 AI659306 Hs.73826 protein tyrosine phosphatase, non-recept Y_phosphatase,Band_41,PDZ 3.70
447397 BE247676 Hs.18442 E-1 enzyme Hydrolase 3.68
452946 X95425 Hs.31092 EphA5 EPH_lbd,fn3,pkinase,SAM 3.66
427144 X95097 Hs.2126 vasoactive intestinal peptide receptor 2 7tm_2 3.65
443775 AF291664 Hs.204732 matrix metalloproteinase 26 Peptidase_M10 3.56
457840 AL360159 Hs.306517 Homo sapiens TRIpartite motif protein ps SPRY,7tm_1 3.52
418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo A_deaminase 3.51
413435 X51405 Hs.75360 carboxypeptidase E Zn_carbOpept 3.46
447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas lipase 3.46
TABLE 8: 136 GENES SIGNIFICANTLY DOWN-REGULATED IN PROSTATE
CANCER COMPARED TO NORMAL PROSTATE

Table 8 shows 136 genes significantly down-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 2. The "average" normal prostate level was set to the mean amongst 4
normal prostate tissues. The "average" prostate cancer level was set to the 85th percentile
amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted from
both the numerator and the denominator before the ratio was evaluated.
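
The ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual analysis pipeline: the percentile uses linear interpolation between ranks (the text does not specify an interpolation rule), "all the tissues" defaults to the pooled normal and tumor samples unless a wider set is passed in, and the function names are hypothetical.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between ranks (assumed rule)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def normal_to_tumor_ratio(normal, tumor, all_tissues=None):
    """Background-corrected ratio of "average" normal to "average" tumor signal."""
    # Assumption: without an explicit tissue panel, pool the two sample groups.
    pool = list(all_tissues) if all_tissues is not None else list(normal) + list(tumor)
    background = percentile(pool, 10)                   # gene-specific background
    numerator = mean(normal) - background               # mean of normal samples
    denominator = percentile(tumor, 85) - background    # 85th percentile of tumors
    if denominator <= 0:
        return None  # ratio undefined once the background is removed
    return numerator / denominator


def is_down_regulated(normal, tumor, threshold=2.0):
    """Apply the Table 8 cut-off (ratio greater than or equal to 2)."""
    ratio = normal_to_tumor_ratio(normal, tumor)
    return ratio is not None and ratio >= threshold


# Toy intensities for one probeset (illustrative numbers only).
normal = [200, 200, 200, 200]
tumor = [20, 30, 40, 50, 60]
ratio = normal_to_tumor_ratio(normal, tumor)
print(round(ratio, 2), is_down_regulated(normal, tumor))
```

How a non-positive denominator was handled is not stated in the text; returning `None` here is simply one way to keep such probesets out of the list.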



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of normal prostate to prostate cancer



Pkey ExAccn UnigeneID Unigene Title R1

425932 M81650 Hs.1968 semenogelin I 57.69
425545 N98529 Hs.158295 Human mRNA for myosin light chain 3 (MLC 19.70
426752 X69490 Hs.172004 titin 15.25
442082 R41823 Hs.7413 ESTs; calsyntenin-2 10.05
407245 X90568 Hs.172004 titin 9.38
422711 D60641 Hs.21739 Homo sapiens mRNA; cDNA DKFZp586I1518 (f 9.05
420813 X51501 Hs.99949 prolactin-induced protein 8.18
411987 AA375975 Hs.183380 "ESTs, Moderately similar to ALU7_HUMAN 7.45
404567 5.62
416030 H15261 Hs.21948 ESTs 5.51
444892 AI620617 Hs.148565 ESTs 5.27
444573 AW043590 Hs.225023 ESTs 5.20
428068 AW016437 Hs.233462 ESTs 5.08
437440 AA846804 Hs.123694 ESTs 4.95
404113 4.75
452279 AA286844 Hs.61260 hypothetical protein FLJ13164 4.75
421058 AW297967 Hs.188181 ESTs 4.63
445592 AV654382 Hs.17947 "ESTs, Weakly similar to K02F3.10 [C.ele 4.53
405163 4.49
405227 4.45
454059 NM_003154 Hs.37048 statherin 4.45
450152 AI138635 Hs.22968 ESTs 4.40
407013 U35637 "gb:Human nebulin mRNA, partial cds" 4.03
403612 4.02
440089 AA864468 Hs.135646 ESTs 4.00
408988 AL119844 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat regi 3.98
436726 AA324975 Hs.128993 "ESTs, Weakly similar to KIAA0465 protei 3.95
459367 BE148877 "gb:CM4-HT0244-111199^4r>h12HT0244Hom 3.95
427318 AF186081 Hs.175783 zinc transporter 352
411762 AW860972 "gb:QVf>CT038M80300-167-h07CT0387Hom 3.85
418668 AW407987 Hs.87150 Human clone A9A2BR11 (CAC)n/(GTG)n repea 3.75
458311 AF069478 "gb:AF069478 Homo sapiens astrocytoma li 3.61
403649 3.60
419682 H13139 Hs.92282 paired-like homeodomain transcription fa 3.58
412519 AA196241 Hs.73980 "troponin T1, skeletal, slow" 3.51
414206 AW276887 Hs.46609 ESTs 3.45
427419 NM_000200 Hs.177688 histatin 3 3.37
420777 AA280223 Hs.130865 ESTs 3.35
428134 AA421773 Hs.161008 ESTs 3.31
450218 R02018 Hs.168640 "Ank, mouse, homolog of" 3.30
433474 AI192195 Hs.147174 "EST, Highly similar to ubiquitin-protei 3.30
418833 AW974899 Hs.292776 ESTs 3.26
400440 X83957 Hs.83870 nebulin 3.16
413778 AA090235 Hs.75535 "myosin, light polypeptide 2, regulatory 3.06
423151 AW838068 "gb:QV3^T0048^)10300-109-f02 LT0048 Hom 3.05
445060 AA830811 Hs.38808 ESTs 2.98
457065 AI476318 Hs.192480 ESTs 2.95
432456 H00093 "^^hBf^.ig/IWOutvrardAlu-primedhn 2.92
405678 2.85
406707 S73840 Hs.531 "myosin, heavy polypeptide 2, skeletal m 2.81
444105 AW189097 Hs.166597 ESTs 2.78
433968 AL157518 Hs.30421 PRO2463 protein 2.73
438522 AA809431 Hs.258886 ESTs 2.73
436562 H71937 Hs.169756 "complement component 1, s subcomponent" 2.68
412417 AA102268 Hs.42175 ESTs 2.67
455590 BE072259 "gb:QV4-BT0538-271299-05^g04 BT0536 Hom 2.65
415380 F07953 Hs.16085 putative G-protein coupled receptor 2.65
428729 AL162331 Hs.191436 hypothetical protein FLJ10619 2.64
408537 AW207734 "gb:UI-H-BI2-age-IH)1-04J1.s1 NCI_CGAP_S 2.63
424706 AA741336 Hs.152108 transcriptional unit N143 2.63
413212 BE072092 "gb:PM4*T0532-1602(XM)03-b11 BT0532 Hom 2.63
406704 M21665 Hs.929 "myosin, heavy polypeptide 7, cardiac mu 2.62
437507 AA758538 Hs.246882 ESTs 2.60
410384 AI933794 Hs.42745 ESTs 2.58
408074 R20723 Hs.124764 ESTs 2.58
436653 AA829828 Hs.292402 ESTs 2.52
458090 AI282149 Hs£6213 "ESTs, Highly similar to FXD3_HUMAN FORK 2.51
432003 AI689154 Hs.122972 ESTs 2.50
436915 AA737400 Hs.142230 ESTs 2.50
410028 AW576454 Hs.258553 ESTs 2.46
448920 AW408009 Hs.22580 alkylglycerone phosphate synthase 2.45
422046 AI638562 "gb:ts50a10.x1 NCLCGAPJM Homo sapiens 2.44
451122 AA015767 Hs.193587 ESTs 2.40
422646 H87863 Hs.151380 ESTs 2.36
451237 AW600293 "gb:EST00049 pGEM-T library Homo sapiens 2.36
400001 AFFX control: BioB-3 2.36
415835 Z45365 "gb:HSC2NF061 normalized infant brain cD 2.36
439706 AW872527 Hs£9761 ESTs 2.36
423341 AW242394 Hs.252495 ESTs 2.36
436486 AA742221 Hs.120633 ESTs 2.35
407449 AJ002784 gb:Homo sapiens mRNA; fetal brain cDNA 5 2.33
430573 AA744550 Hs.136345 ESTs 2.32
401974 2.31
443356 AL044498 Hs.133262 "ESTs, Weakly similar to PH021 7 reverse 2.31
430751 NM_012471 Hs.247868 transient receptor potential channel 5 2.25
439128 AI949371 Hs.153089 ESTs 2.25
448765 R15337 Hs.21958 "Homo sapiens cDNA FLJ10532 fis, clone N 2.25
451130 AI762250 Hs.211347 ESTs 2.24
405420 2.23
455029 AW851258 "gb:ILM)TQ22Q-160200-OS6-H06 CT0220 Hom 2.23
438224 AA933999 "gb:on91f04.s1 Soares_NFL_T_GBC_S1 Homo 2.23
407764 BE008347 "gb:CM0-BN0154-080400-325^04BN0154Hom 2.23
413549 BE252470 "gb:601108292F1 NIH_MGC_16 Homo sapiens 2.23
437010 AA741368 Hs.291434 ESTs 2.23
435111 AI914279 Hs.213740 ESTs 2.22
403375 2.21
455060 AW853441 "gb:RC1-CT0252-030100-02frg09CT0252 Hom 2.21
409792 AW854153 "gb:RC^0254-060400-029KiWCT0254Hom 2.20
421154 AA284333 Hs.287631 "Homo sapiens cDNA FLJ14269 fis, clone P 2.19
401963 2.18
435034 AF168711 Hs.159397 x 010 protein 2.18
448996 AW998989 Hs.105749 KIAA0553 protein 2.18
436816 AW297599 Hs.255667 ESTs 2.17
442252 AI733395 Hs.129124 ESTs 2.17
419310 AA236233 Hs.188716 ESTs 2.16
418579 H91800 Hs.124156 ESTs 2.16
423315 R54109 Hs.26096 ESTs 2.16
432744 AA988835 Hs.38664 ESTs 2.15
424492 AI133482 Hs.165210 ESTs 2.15
424770 AA425562 "gb:zw46e05.r1 Soares_total_fetus_Nb2HF8 2.15
437101 AA744518 Hs.120610 ESTs 2.15
428793 AC004957 Hs.298975 "ESTs, Highly similar to collapsin-2-lik 2.15
415708 H56475 "gb:yt87d11.M Soares_pineal_gland_N3HPG 2.13
459619 2.12
427506 AK000134 Hs.179100 hypothetical protein FLJ20127 2.12
452508 AA804174 Hs.184354 ESTs 2.10
410881 AW809157 "gb:RC0-ST0118-041099-031 <07J ST0118 Homo sapiens cDNA, mRNA sequence" 2.10
403087 2.10
403869 2.10
445028 D81194 Hs.282499 ESTs 2.10
447884 H29505 "gb:ym60d10j1 Soares infant brain 1NIB Homo sapiens cDNA clone 5', mRNA sequence" 2.10
414575 H11257 Hs.295233 ESTs 2.09
420351 BE218221 Hs.190044 ESTs 2.08
426998 BE274360 "gb:601121068F1 NIH_MGC_20 Homo sapiens cDNA clone 5', mRNA sequence" 2.08
405455 2.08
423843 AA332652 "gb:EST36627 Embryo, 8 week I Homo sapiens cDNA 5' end similar to similar to monoamine oxidase B, mRNA sequence" 2.08
406135 2.07
427046 BE246180 Hs.121385 ESTs 2.07
403493 2.05
444514 AI682905 Hs.270431 "ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY [H.sapiens]" 2.05
435884 AA701443 Hs.192868 ESTs 2.05
419629 AB020695 Hs.91662 KIAA0888 protein 2.03
405900 2.03
457350 AW974438 Hs.194136 "ESTs, Moderately similar to AF091457 1 zinc finger protein RIN ZF [R.norvegicus]" 2.02
400007 AFFX control: BioDn-5 2.01
406978 M64358 "gb:Human rhom-3 gene, exon." 2.00
TABLE 8A shows the accession numbers for those primekeys lacking a UnigeneID in Table
8. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accessions

407764 1014849_1 BE008347 BE008320 BE083307 BE083311 AW075968
408537 1064753_1 AW207734 D60164 D81150 D81078 D61356 AW998804
409792 1154677_1 AW854153 AW500210 BE145772 AW501310
410881 1225682_1 AW809157 AW812181 AW812175 AW812172 AW812161 AW812165
411762 1256906_1 AW860972 AW862598 AW662599 AW860988 AW860983 AW860898 AW860925 AW860922 AW860986 AW860984 AW860989
413212 1353792_1 BE072092 BE072106 BE072086 BE072098 BE072103
413549 1375933_2 BE252470 BE147573
415708 1548209_1 H56475 F29401 F34552
415835 1558511_1 Z45365 R25905 H05203 T77496
422046 210744_1 AI638562 T16929 H13401 F07773 R55836
423151 225415_1 AW838068 AW837986 AW838067 AA322487 AW837936
423843 232510_1 AA332652 AA331633 AW999369 AW902993 BE170475 AA378845 AW964175 AI475221
424770 243504_1 AA425562 AI880208 AA346646 N22655 AW811775 AW811786
426998 274259_-1 BE274360
432456 347718_2 H00093 H00079 H00070 H00054 H00049 H00063 AW905306 AW905241 AW905410 AW905307 AW905411 AW905240 AW905210 AW905352 AW905304 AW905239 AW905242 AW905243 H00087
438224 452656_1 AA933999 AA781181
447884 740749_1 H29505 R18575 Z43580 T48738 AI435454 BE004683
451237 863269_1 AW600293 AI767468
455029 1249374_1 AW851258 AW851435 AW851106 AW851421
455060 1251259_1 AW853441 BE145228 BE145218 BE145162 BE145283
455590 1335127_1 BE072259 BE072230 BE007911
458311 543550_1 AF069478 AF069479 AF069480
TABLE 8B shows the genomic positioning for those primekeys lacking UnigeneIDs and
accession numbers in Table 8. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
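
To make the column layout concrete, a row of Table 8B (Pkey, Ref, Strand, nucleotide position) could be read into a small record as sketched below. The class and function names are hypothetical, and the length calculation assumes that both endpoints of the listed range belong to the exon, which the text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    pkey: int      # Eos probeset identifier
    ref: str       # Genbank Identifier (GI) number of the genomic sequence
    strand: str    # "Plus" or "Minus"
    start: int
    end: int

    @property
    def length(self):
        # Assumption: coordinates are inclusive at both ends.
        return self.end - self.start + 1


def parse_exon_row(row):
    """Parse one whitespace-separated Table 8B row into a PredictedExon."""
    pkey, ref, strand, position = row.split()
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    start, end = (int(part) for part in position.split("-"))
    if start > end:
        raise ValueError(f"start after end in {position!r}")
    return PredictedExon(int(pkey), ref, strand, start, end)


exon = parse_exon_row("401963 3126783 Plus 51382-51521")
print(exon.strand, exon.length)
```

The example row is the first entry of Table 8B; rows whose position was split by extraction (for example a stray space inside a number) would need cleaning before parsing.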



Pkey Ref Strand Nt_position

401963 3126783 Plus 51382-51521
401974 3126777 Plus 85330-85683
403087 8954241 Plus 169511-169795
403375 9255944 Minus 92554-92795
403493 7341425 Plus 157568-159084
403612 8469060 Minus 94723-94859
403649 8705159 Minus 27141-27247
403869 7280046 Minus 34379-34583
404113 9588571 Minus 13446-13646
404567 7249169 Minus 101320-101501
405163 9966267 Minus 161171-161299
405227 6731245 Minus 22550-22802
405420 7211837 Minus 13428-13582
405455 7656675 Plus 134112-134671
405678 4079670 Plus 151821-152027
405900 6758795 Minus 71181-71535
406135 9164918 Minus 65489-65715
TABLE 9: 1001 GENES SIGNIFICANTLY UP-REGULATED IN NORMAL PROSTATE
COMPARED TO PROSTATE CANCER

Table 9 shows 1001 genes significantly up-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 8.14. The "average" normal prostate level was set to the mean
amongst 4 normal prostate tissues. The "average" prostate cancer level was set to the 85th
percentile amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted
from both the numerator and the denominator before the ratio was evaluated.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of prostate cancer to normal prostate



Pkey ExAccn UnigeneID Unigene Title R1


451002 AA013299 


Hs.8018 ESTs, Weakly similar to ALU3_HUMAN ALU S 


1684.00 


435596 AA689465 


Hs.188999 ESTs 


738.00 


443576 AIO78027 


Hs.169338 ESTs 


246.86 


434247 AA928116 


Hs272065 ESTs 


245.20 


400452 AK000165 


gb:Homo sapiens cDNA FU20178 lis, done 


222.00 


405932 


221.33 


427906 AAB54330 


Hs.1 66520 ESTs 


212.00 


443685 Al 68 5550 


Hs.1 74481 ESTs 


• 163.20 


451554 AJ474866 


Hs.1 93237 ESTs 


149.45 


418323 NMJW2118 


Hs.1 162 major histocompatibility complex, class 


126.11 


429480 M36860 


Hs.9295 etastin (supravatvular aortic stenosis, 


12357 


426025 AW138330 


Hs.233778 ESTs 


120.00 


418917 X02994 


Hs.1217 adenosine deaminase 


106.75 


404407 




105.71 


442027 AJ652926 


Hs.128395 ESTs 


10053 


433704 AA608684 


Hs.121705 ESTs, Moderately similar to ALUC.HUMAN I 


94.00 


453758 U83527 


gb:HSU83527 Human fetal brain (Miovett) 


89.18 


415354 F06495 


gb:HSC1ABG51 normalized infant brain cDN 


87.73 


424239 M67439 


Hs.143526 dopamine receptor D5 


86.82 


444143 AW747996 


Hs.160999 ESTs 


86.43 


401672 




7726 


430590 AW383947 


Hs.246381 CD68 antigen 


68.47 


411972 BE074959 


gb:PMO-BTG5B2-31010(MX)1-f08 BT0582 Homo 


68.00 


448992 AI766053 


Hs.188346 ESTs 


6126 


408828 BE540279 


gb:601059857F1 NIH_MGC_10 Homo sapiens c 


57.71 


409653 AW451693 


Hs.220826 ESTs 


* 56.40 


402964 




54.67 


422673 N59027 


gb:yv59d1 1 .r1 Soares fetal Ever spleen 


54.00 


422568 AA372275 


Hs.279800 Homo sapiens cDNA FU1 1383 fis, done HE 


54.00 


438907 R32704 


Hs.301298 ESTs 


5236 


405172 




5236 


444897 AW137088 


Hs.144857 ESTs 


52.32 


458019 AW592931 


Hs.256298 ESTs 


51.63 


405275 AB028989 


Hs.88500 mitogen-activated protein kinase 8 inter 


50.98 


457815 AA703679 


Hs.106999 ESTs, Weakly simflar to SYT5_HUMAN SYNAP 


49.60 


424385 AA339666 


gb£ST44776 Fetal brain ! Homo sapiens c 


4830 


407172 T54095 


gb:ya92c05.s1 Stratagene placenta (93722 


47J98 


428202 AA424163 


Hs.156895 ESTs 


46.83 


435672 AI700148 


Hs.283626 ESTs 


4357 


420283 AA485224 


Hs57734 G proteirwxmpted receptor kinase-intera 


43.00 


417016 AA837098 


Hs.269933 ESTs 


42.70 


438854 AF074994 


Hs24240 ESTs 


4257 
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406134 42.43 

457319 AA480895 Hs20t552 ESTs, Weakty simflar to T17288 hypotheti 4231 

409314 AA070266 gbzm69d04 jl Stratagene neuroepitheBum 4225 

401124 41j61 

5 429316 AI371157 Hs.178538 ESTs 40.00 

420317 AB006628 Hs.96485 KIAAQ290 protein 39.64 

457586 AW062439 gbAflR0<;T0060-120899«001-f08CT0060Horno 39.60 

417407 AA923278 Hs.290905 ESTs, Weakly similar to protease [H.sapl 38.73 

430269 BE221682 Hs.178364 ESTs 38.06 

10 439602 W79114 Hs.58558 ESTs 3659 

433686 AA604799 Hs.136528 ESTs, Moderately similar to ALU1_HUMAN A 3629 

417993 AW963705 Hs295806 ESTs, Weakly similar to ALU7_HUMAN ALU S 36.18 

428214 AA936282 Hs.120397 ESTs 36.10 

416908 AA333990 Hs.80424 coagulation factor XIII, A1 polypeptide 36.08 

15 426264 BE314852 Hs. 168694 hypothetical protein FU 10257 36.00 

415911 H08796 Hs.124952 ESTs 36.00 

457502 AA076049 Hs274415 Homo sapiens cDNA RJ1 0229 Rs t clone HE 3523 

421566 NM_000399 Hs.1395 earry growth response 2 (Krox-20 (Drosop 3520 

401468 34.89 

20 458561 AI220150 Hs.211195 ESTs 34.60 

433601 BE350738 Hs.123993 ESTs, Weakly simSar to T00366 hypotheti 3324 

454977 AW848032 gb:IL3-CTQ2 14-23 1299-053-D 11 CT0214Homo 32.96 

402828 32.93 

414522 AW518944 Hs.76325 Homo sapiens cDNA: FU23125 lis, clone L 31.76 

25 402842 31.6B 

421245 AA285363 gb:HTH280 HTCDL1 Homo sapiens cDNA 5*/? 3159 

401631 F05183 Hs.1799 CD1D antigen, d polypeptide 3126 

408057 AW139565 gb:UI-H-BI1-aea-d-04-0-Ul.s1 NCI_CGAP_Su 3124 

408069 H81795 gb:ys68a10.r1 Soares retina N2b4HR Homo 3120 

30 438694 T87479 Hs291797 ESTs 31.09 

449156 AF103907 Hs.17t353 prostate cancer antigen 3 29.78 

428796 AU076734 Hs. 193665 solute carrier family 28 (sodium-coupled 29.76 

452549 AI907039 gbPM-BTt 34-020499-566 BT134 Homo sapten 29.59 

410129 BE244074 Hs285531 regulator of Fas-induced apoptosis 29.53 

35 414464 AI870175 Hs.13957 ESTs 29.47 

412326 R07566 Hs.73817 Small inducible cytokine A3 (homologous 2922 

459081 W07808 gbzbQ3a12.r1 Soares_fetal_lung_NbHL19W 2920 

448702 AW102670 Hs.122464 ESTs 29.13 

451939 U80456 Hs27311 single-minded (Orosophlla) homolog 2 28.74 

40 443412 W84893 Hs.9305 angiotensin receptor-like 1 28.61 

457324 AB028990 Hs243901 KIAA1 067 protein 2B24 

424247 X14008 Hs234734 tysozyme (renal amyloidosis) 28.18 

457140 A1279960 Hs.178140 ESTs 28.12 

444151 AW972917 Hs.128749 alpha-methylacyWoA racemase 28.06 

45 457669 AW104257 Hs.123426 ESTs, Weakly similar to putative serine/ 27.61 

412429 AV650262 Hs.75765 GR02 oncogene 27.36 

405495 27.33 

406516 2725 

407997 AW135429 Hs243577 ESTs 26.96 

50 442115 AW452332 Hs257554 ESTs 2636 

409038 T97490 Hs.50002 small inducible cytokine subfamily A (Cy 2634 

402838 26.32 

449846 AI979284 Hs200552 ESTs * 2621 

417153 X57010 Hs31343 collagen, type II, alpha 1 (primary oste 2620 

55 439792 NM 014856 Hs.6684 KIAA0476 gene product 25.91 

450096 A1682088 Hs223368 ESTs 25.60 

424196 AL133660 Hs.1 42926 Homo sapiens mRNA; cDNA £)KFZp434M0927 (f 2557 

414246 BE391090 Hs280278 EST 2557 

420848 NMJJ05188 Hs.99980 Cas-Br-M (murine) ecotropic retroviral t 25.48 

60 424778 AA251048 Hs.153042 lymphocyte antigen 9 25.42 

409126 AA063426 gb.7f70c08.s1 Soaresj>ineal_gland_N3HPG 2525 

443936 AW083491 Hs.31196 ESTs 2522 

419392 W28573 gbSMO Human retina cDNA randomly prim 25.01 

411201 T74588 Hs.8509 ESTs, Weakly similar to C03_HUMAN COMPLE 2435 

65 422940 BE077458 gb^1-BT0606^9050W15-bO4 BT0606 Homo 24.76 

437571 AA760894 Hs.153023 ESTs 24.74 

433973 A1014723 Hs.131770 ESTs 2457 

422416 BE019557 Hs.11900 Human DNA sequence from clone RP4-583P15 2453 

421552 AF026692 Hs.105700 secreted frizzted-related protein 4 24.49 
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443868 U25758 Hs. 134584 ESTs 24.49 

424800 AL035588 Hs.153203 MyoD family Inhibitor 24.10 

453633 AA357001 Hs.34045 hypothetical protein FU20764 24.04 

430565 AL122081 Hs244343 cadherin related 23 24.00 

5 433694 AI208611 Hs.12066 Homo sapiens cONA FU1 1720 fis, done HE 23.89 

451045 AA215672 gb:zr96e09.s1 NCI CGAPJ5CB1 Homo sapiens 23.83 

408583 AW449674 Hs.47359 ESTs 23.73 

444040 AF204231 Hs.182982 golgin-67 23.62 

414182 AA1 36301 gb:zk93g04.s1 SoaresjjregnanLuterusJtoH 23.39 

10 418678 NM.001327 Hs.167379 cancet/testis antigen 2320 

408380 AF123050 Hs.44532 diubiquhin 22J68 

456076 BE243877 Hs.76941 ATPase, Na+/K+ transporting, beta 3 po!y 2255 

418299 AA279530 Hs.83968 integrin, beta 2 (antigen CD1 8 (p95),ly 22.38 

444917 R68651 Hs.144997 ESTs 2226 

15 444381 BE387335 Hs283713 ESTs 22.08 

415788 AW628686 Hs.78851 KIAA02 17 protein 22.04 

410896 AW809637 gb:MR4-ST0124-261099415-b07ST0124Horno 22.00 

412978 AM31708 Hs.820 homeoboxC6 2155 

458418 AV653846 Hs.126261 Homo sapiens Chromosome 16 BAC done CIT 21.94 

20 454791 BE071874 gb:RC2-BT0522-120200O14-a06BT0522Homo 21.84 

408748 J05500 Hs.47431 spectrin, beta, erythrocytic (includes s 2126 

416011 H14487 gb:ym18c10.r1 Soares infant brain 1NIB H 2124 

440474 AI207936 Hs.7195 gamma-aminobutyric add (GABA) A recepto 21.14 

447047 AI623698 Hs246306 Homo sapiens cDNA: FU23529 fis, done L 21.11 

25 426793 X89887 Hs.172350 HIR (histone eel) cyde regulation defec . 21.10 

409841 AW502139 gb:UI-HF-BR0p-ajr-e^5-(HJLr1 NIH_MGC_5 21.07 

405685 20.90 

457359 AI983207 Hs.1 92481 ESTs, Weakly similar to SYPHJHUMAN SYNAP 20.84 

423067 AA321355 Hs285401 ESTs 20.74 

30 422355 AW403724 Hs.140 immunoglobulin heavy constant gamma 3 (G 20.73 

401201 20.73 

458278 W28912 Hs.129019 ESTs 20.66 

439097 H66948 gb7r86dlO.1l Soares fetal Bver spleen 20.67 

414875 H42679 Hs.77522 major histocompatibility complex, dass 20.66 

35 400926 20.66 

451355 NNL004197 Hs.444 serine/threonine kinase 19 20.64 

446982 AW500221 Hs.43616 Homo sapiens mRNA for FLJ00029 protein, 20.61 

417105 X60992 Hs.81226 CD6 antigen 20.61 

405777 2051 

40 424123 AW966158 Hs.58582 Homo sapiens cONA RJ12702 fis, done NT 2020 

425009 X58288 Hs.154151 protein tyrosine phosphatase, receptor t 20.10 

443271 BE568568 Hs.195704 ESTs 19.98 

421064 AI245432 Hs.101382 tumor necrosis factor, alpha-induced pro 19.98 

418819 AA228776 Hs.191721 ESTs 19.94 

45 457595 AA584854 gkno09h11.s1 NCLCGAP_Phe1 Homo sapiens 19.90 

404426 19.84 

412571 U43143 Hs.74049 fms-relaled tyrosine kinase 4 19.79 

431457 NML012211 Hs256297 integrin, alpha 11 19.62 

414002 NMJJ06732 Hs.75678 FBJ murine osteosarcoma viral oncogene h 1957 

50 418994 AA296520 Hs.89546 Setectin E (endothelial adhesion molecul 1956 

437158 AW090198 Hs.4779 KIAA1 150 protein 1952 

437866 AA156781 Hs.83992 ESTs 19.44 

417421 AL138201 Hs.82120 nuclear receptor subfamily 4, group A, m - 19.34 

433057 X15675 Hs296832 Human pTR7 mRNA for repetitive sequence 1922 

55 421730 AW449808 Hs. 1 64036 glucosamine (N-acetyl)-6-sulfata$e (Sanf 1921 

456557 AA284477 Hs.96618 ESTs 18.77 

440806 AI247422 Hs.129966 ESTs 18.76 

439845 AL355743 Hs.56663 Homo sapiens EST from done 41214, full 18.65 

416155 AI807264 Hs205442 ESTs, Weakly similar to AF1 17610 1 inner 18.64 

60 437820 AA769062 Hs.16029 ESTs, Weakly similar to alternatively sp 18.62 

450923 AW043951 Hs.38449 ESTs 1859 

418329 AW247430 Hs.84152 cystathionine-beia-synthase 1858 

424537 AI673027 Hs.143271 ESTs 1855 

447742 AF1 13925 Hs.19405 caspase recruitment domain 4 1852 

65 415251 R42863 Hs.7124 ESTs 18.47 

440770 AA912815 Hs222078 ESTs 18.40 

407711 AI085846 Hs25522 ESTs 18.32 

427157 U51166 Hs.173824 thymine-DNA gtycosyiase 1828 

409847 AW501751 Hs279733 ESTs 18.15 
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417240 N57568 Hs.176028 EST 1B.13 

435732 AF229178 Hs.123136 leucine rich repeat and death domain con 18.12 

438898 AW977385 Hs278615 ESTs 18.12 

432485 N9086S Hs276770 CDW52 antigen {CAMPATH-1 antigen) 17.90 

5 429490 A1971131 Hs293684 ESTs, Weakly similar to alternatively sp 17.82 

429984 AL050102 Hs227209 DKFZP586F1 019 protein 17.82 

449214 AI889114 Hs.195663 ESTs 17.75 

433867 AK000596 Hs.3818 hippocalcin-like 1 17.72 

431735 AW977724 Hs.75968 thymosin, beta 4, X chromosome 1771 

10 401515 17.67 

444045 AI097439 Hs.135548 ESTs 1758 

442754 AL045825 Hs.210197 ESTs 1755 

426559 AB001914 Hs.170414 paired basic amino acid cleaving system 1754 

432415 T16971 HS289014 ESTs 1750 

15 427829 AI188225 Hs.127462 ESTs 1750 

432516 R08003 Hs.188013 ESTs 17.44 

435259 M152106 Hs.4859 cyclin L ania-6a 173 

414989 T81668 gb:yd29c04.r1 Soares fetal liver spleen 17.31

444880 AW118683 Hs.154150 ESTs 17.30

417651 R06874 Hs.268628 ESTs 17.27

453457 AL037103 Hs.270599 ESTs, Weakly similar to unnamed protein 17.22

424246 AW452533 Hs.143604 Kaiso 17.22

419078 M93119 Hs.89584 insulinoma-associated 1 17.18 

417696 BE241624 Hs.82401 CD69 antigen (p60, early T-cell activati 17.14

431117 AF003522 Hs.250500 delta (Drosophila)-like 1 17.14

455254 AW877015 gb:QV^-PT0010-25030(H)96-f12PT0010Homo 17.14

425782 U66468 Hs.159525 cell growth regulatory with EF-hand doma 17.12

426678 H08170 Hs.113755 ESTs 17.12

426403 NM_000361 Hs.2030 thrombomodulin 17.01

425905 AB032959 Hs.161700 KIAA1133 protein 17.00

438867 AW451157 Hs.181157 ESTs 16.98 

420940 AA830664 Hs.143974 ESTs 16.94 

459234 AI940425 gb:CMMT0052-150799^4-c04 CT0052 Homo 16.92

404756 16.91 

422247 U18244 Hs.113602 solute carrier family 1 (high affinity a 16.90

420568 F09247 Hs.167399 protocadherin alpha 5 16.88

443559 AI076765 Hs.269899 ESTs 16.80

438703 AI803373 Hs.31599 ESTs 16.78 

411424 AW845985 gb:RC2-CT0163-200999-002-H08 CT0163 Homo 16.70

402895 16.69

422538 NM_006441 Hs.118131 5,10-methenyltetrahydrofolate synthetase 16.68

447108 AW449602 Hs.217953 ESTs, Moderately similar to NK-TUMOR REC 16.65 

448520 AB002367 Hs.21355 doublecortin and CaM kinase-like 1 16.54

438567 AW451955 Hs.153065 ESTs 16.52

407811 AW190902 Hs.40098 cysteine knot superfamily 1, BMP antagon 16.50

410721 R23534 Hs.2730 heterogeneous nuclear ribonucleoprotein 16.50

437133 AB018319 Hs.5460 KIAA0776 protein 16.40 

408182 AA047854 gb:zf49g04j1 Soares retina N2b4HR Homo 16.32

417315 AI080042 Hs.180450 ribosomal protein S24 16.30

431640 AA534908 Hs.2860 POU domain, class 5, transcription facto 16.28

439882 AA847856 Hs.124565 ESTs 16.20

418277 AW135221 Hs.130812 ESTs 16.09
410688 AW796342 gb:PM2-UM0027-23020XH)02-h02 UM0027 Homo 16.04

420120 AL049610 Hs.95243 transcription elongation factor A (SII) 16.04

429597 NM_003816 Hs.2442 a disintegrin and metalloproteinase doma 16.02

447033 AI357412 Hs.157601 EST - not in UniGene 16.02 

421684 BE281591 Hs.106768 hypothetical protein FLJ10511 15.94

408599 AA055800 Hs.222933 ESTs 15.93

446012 AV656098 Hs.172382 hypothetical protein FLJ20001 15.86

409671 AA076769 gb:7B02B10 Chromosome 7 Fetal Brain cDNA 15.85
405934 15.84

426108 AA622037 Hs.166468 programmed cell death 5 15.84 

416208 AW291168 Hs.41285 ESTs 15.48 

410708 AA534370 Hs.154088 Homo sapiens cDNA: FLJ22756 fis, clone K 15.42

447342 AI199268 Hs.19322 ESTs; Weakly similar to !!!! ALU SUBFAMI 15.38

454563 AW807530 gb:CMO-ST008M30999-054^02 ST0081 Homo 15.37

411507 AW850140 gb:IL3-CT0219-261099-023-D11 CT0219 Homo 15.36

438170 AI916685 Hs.194601 ESTs 15.29

416292 AA179233 Hs.42390 nasopharyngeal carcinoma susceptibility 15.26
406638 M13861 gb:Human T-cell receptor active beta-cha 15.26

446686 AW138043 Hs.156307 ESTs 15.25

434485 AI623511 Hs.118567 ESTs 15.24

441188 AW292830 Hs.255609 ESTs 15.22

444172 BE147740 Hs.104558 ESTs 15.22

409521 BE244854 Hs.159578 Homo sapiens mRNA for FLJ00020 protein, 15.16

420748 AA279956 Hs.88672 ESTs 15.14 

422583 AA410506 Hs.118578 H.sapiens mRNA for ribosomal protein L18 15.14

424240 AB023185 Hs.143535 calcium/calmodulin-dependent protein kin 15.12

451118 AI862096 Hs.60640 ESTs 15.12

437495 BE177778 gb:RC1-HT0598-31030XH)l2-H)7 HT0598 Homo 15.12

445467 AI239832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S 15.06 

418305 AW006783 Hs.6686 ESTs 15.03 

402812 15.02 

436851 AA732480 Hs.293581 ESTs 15.00

400991 15.00 

415752 BE314524 Hs.78776 Human putative transmembrane protein (nm 14.96 

429900 AA460421 Hs.30875 ESTs 14.90 

403683 14.84 

430315 NM_004293 Hs.239147 guanine deaminase 14.80

451952 AL120173 Hs.301663 ESTs 14.72 

424687 J05070 Hs.151738 matrix metalloproteinase 9 (gelatinase B 14.69

447229 BE617135 gb:601441677F1 NIH_MGC_65 Homo sapiens c 14.67

425818 AB021225 Hs.159581 matrix metalloproteinase 17 (membrane-in 14.65

448553 AI638449 Hs.173031 ESTs 14.63

431089 BE041395 Hs.283676 ESTs, Weakly similar to unknown protein 14.60

459145 AI903354 gb:RC-BT029-100199-117 BT029 Homo sapien 14.55

449650 AF055575 Hs.297647 ESTs, Moderately similar to calcium chan 14.54

400952 14.46 

445885 AI734009 Hs.127699 EST cluster (not in UniGene) 14.44

407938 AA905097 Hs.85050 phospholamban 14.42 

431676 AI685464 Hs.292638 ESTs 14.40

437210 AA311443 Hs.293563 Homo sapiens mRNA; cDNA DKFZp586E2317 (f 14.36

451900 AB023199 Hs.27207 KIAA0982 protein 14.36

445800 AA126419 Hs.301632 ESTs 14.32

412368 AW945992 Hs.181125 immunoglobulin lambda locus 14.31 

409055 AW304028 Hs.300578 ESTs 14.23

408763 W57550 Hs.301526 Homo sapiens cDNA FLJ13181 fis, clone NT 14.22

446734 AL049278 Hs.16074 Homo sapiens mRNA; cDNA DKFZp564I153 (fr 14.22

413551 BE242639 Hs.75425 ubiquitin associated protein 14.22

421913 AI934365 Hs.109439 osteoglycin (osteoinductive factor, mime 14.22

452712 AW838616 gb:RC54J0054-140200413-DQ1 LT0054 Homo 14.22

451468 AW503398 Hs.210047 ESTs 14.16

406038 Y14443 Hs.88219 zinc finger protein 200 14.14 

424909 S78187 Hs.153752 cell division cycle 25B 14.07

434078 AW880709 Hs.283683 EST 14.07

415254 AI815831 Hs.184378 ESTs 14.05

418196 AI745649 Hs.26549 ESTs, Weakly similar to T00066 hypotheti 14.02

410020 T86315 Hs.728 ribonuclease, RNase A family, 2 (liver, 13.98

411352 NM_002890 Hs.758 RAS p21 protein activator (GTPase activa 13.98

429848 AF145439 Hs.225946 chemokine (C-C motif) receptor 9 13.95

413729 BE159999 gb:QV1-HT0412-27£O0r>123-d10HT0412Homo 13.90 

400125 13.88

420319 AW406289 Hs.96593 hypothetical protein 13.85 

448272 AI479094 Hs.170786 ESTs 13.80

422695 AA315158 gb:EST186956 HCC cell line (metastasis t 13.80

424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78 

458048 H30340 Hs.173705 Homo sapiens cDNA: FLJ22050 fis, clone H 13.78

408894 AI935400 Hs.217286 ESTs 13.76

454093 AW860158 gb:RCf>CT0379-2901(XH)32-b04 CT0379 Homo 13.75

410889 X91662 Hs.66744 twist (Drosophila) homolog (acrocephalos 13.74

457751 AI908236 gb:IL-BT166-180399-010 BT166 Homo sapien 13.72

455131 AW857913 gb:RCf>CT0323-231199-03t-b05 CT0323 Homo 13.69

408364 AW015238 Hs.128453 ESTs 13.67 

425907 AA365752 Hs.155965 ESTs 13.62

402359 13.60

401044 13.53

409877 AW502498 Hs.157150 ESTs, Weakly similar to zinc finger prot 13.53

423690 AA326648 Hs.23804 ESTs 13.49
430685 AI690234 Hs.191666 ESTs, Weakly similar to reverse transcri 13.47

414052 AW578849 Hs.283552 ESTs, Weakly similar to unnamed protein 13.46

447858 AW080339 Hs.211911 ESTs 13.44

435716 AI573283 Hs.38458 ESTs 13.44

439120 H56389 gb:yt87c03.r1 Soares_pineal_gland_N3HPG 13.43

402788 13.40 

451591 AA886446 Hs.146278 ESTs 13.40 

405411 13.38 

426558 AW188574 Hs.24218 ESTs 13.34

453506 AA132818 Hs.110407 ESTs, Weakly similar to coded for by C. 13.33

416445 AL043004 Hs.300678 Human serine/threonine kinase mRNA, part 13.32

457084 AI074149 Hs.150905 ESTs, Weakly similar to chondroitin 4-su 13.32 

403838 13.32 

427337 Z46223 Hs.176663 Fc fragment of IgG, low affinity IIIb, r 13.30

434318 AW207552 Hs.116328 ESTs, Weakly similar to dJ134E15.1 [H.sa 13.28

435193 N41359 Hs.218107 ESTs 13.28

414756 AW451101 Hs.159489 ESTs, Moderately similar to hexokinase I 13.27

420626 AF043722 Hs.99491 RAS guanyl releasing protein 2 (calcium 13.26

420052 AA418850 Hs.44410 ESTs 13.25

414020 NM_002984 Hs.75703 small inducible cytokine A4 (homologous 13.25

403851 13.24

422647 W07492 Hs.157101 ESTs 13.21

433598 AI762836 Hs.271433 ESTs, Moderately similar to ALU2_HUMAN A 13.21

409065 AB033113 Hs.50187 KIAA1287 protein 13.20

435063 R21966 Hs.57734 G protein-coupled receptor kinase-intera 13.19

439367 BE386844 Hs.248746 ESTs 13.17

451957 AI796320 Hs.10299 Homo sapiens cDNA FLJ13545 fis, clone PL 13.16

420569 AA278362 Hs.289062 Homo sapiens cDNA FLJ12334 fis, clone MA 13.14

447883 BE262802 Hs.4909 dickkopf (Xenopus laevis) homolog 3 13.07

426490 NM_001621 Hs.170087 aryl hydrocarbon receptor 13.06

414789 AA155859 Hs.79708 ESTs 13.05 

451418 BE387790 Hs.26369 ESTs 13.04

443494 T99719 Hs.270404 Homo sapiens cDNA: FLJ22389 fis, clone H 13.03

425878 AW964806 Hs.38085 ESTs, Weakly similar to putative glycine 13.02

431912 AI660552 Hs.154903 ESTs, Weakly similar to A56154 Abl subst 13.00

407122 H20276 Hs.31742 ESTs 13.00 

456491 AL137466 Hs57277 Homo sapiens mRNA; cDNA DKFZp434H1322 (f 12.99

448172 N75276 Hs.135904 ESTs 12.98 

452144 AA032197 Hs.102558 ESTs 12.96

419953 BE267154 Hs.125752 ESTs 12.96

416182 NM_004354 Hs.79069 cyclin G2 12.94

451154 AA015879 Hs.33536 ESTs 12.93 

412257 AW903830 gb:CM4-NN1037-250400-155-h04 NN1037 Homo 12.93

449784 AW161319 Hs.12915 ESTs 12.92

432695 D63480 Hs.278634 KIAA0146 protein 12.92

454105 NM_001259 Hs.38481 cyclin-dependent kinase 6 12.92

439093 AA534163 Hs.5476 serine protease inhibitor, Kazal type, 5 12.90

416098 H41324 Hs.31581 ESTs, Moderately similar to ST1B_HUMAN S 12.88

424897 D63216 Hs.153684 frizzled-related protein 12.88

414604 AU076649 Hs.76556 growth arrest and DNA-damage-inducible 3 12.88

414664 AA587775 Hs.66295 Homo sapiens HSPC311 mRNA, partial cds 12.84

452560 BE077084 gb:RC5-BT0603-22020(M)13<X)7 BT0603 Homo 12.84

413869 NM_000878 Hs.75596 interleukin 2 receptor, beta 12.80

452359 BE167229 Hs.29206 Homo sapiens clone 24659 mRNA sequence 12.80

435886 BE265839 Hs.12126 hepatocellular carcinoma-associated anti 12.78

445230 U97018 Hs.12451 echinoderm microtubule-associated protei 12.78 

412226 W26786 gb:15d7 Human retina cDNA randomly prime 12.77 

446619 AU076543 Hs.313 secreted phosphoprotein 1 (osteopontin, 12.76 

447769 AW873704 Hs.48764 ESTs 12.76 

414478 AI306389 Hs.76240 adenylate kinase 1 12.76

425383 D83407 Hs.156007 Down syndrome critical region gene 1-lik 12.68

450704 H85157 Hs.40696 ESTs 12.66 

405856 12.66

412935 BE267045 Hs.75064 tubulin-specific chaperone c 12.65

402802 12.62

452588 AA889120 Hs.110637 homeo box A10 12.62

419978 NM_001454 Hs53974 forkhead box J1 12.62

403137 12.60 

430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 12.57
446076 AJ133123 Hs.20196 adenylate cyclase 9 12.56

450462 F07097 Hs.300828 Homo sapiens mRNA full length insert cDN 12.54

405236 12.52

409292 AA071051 gb:zm58e05.s1 Stratagene fibroblast (937 12.47 

421540 AA767669 Hs.10242 ESTs 12.47

425840 AW978731 Hs.301824 ESTs 12.44 

443181 AI039201 Hs.54548 ESTs 12.42 

452436 BE077546 Hs.31447 ESTs 12.42 

455183 AW984111 gbflC0-HN(X)07.16(raW11.{09HN(X)07Homo 12.40 

432887 AI926047 Hs.162859 ESTs 12.37

410494 M36564 Hs.64016 protein S (alpha) 12.36

439024 R96696 Hs.35598 ESTs 12.36 

451246 AW189232 Hs.39140 cutaneous T-cell lymphoma tumor antigen 12.36

432892 AL042615 Hs.15995 ESTs 12.35 

418982 AI348838 Hs.13073 ESTs 12.35

414516 AI307802 Hs.279551 ESTs 12.34

440134 BE410734 gb:601301619F1 NIH_MGC_21 Homo sapiens c 12.29

443873 AL048542 Hs.16291 ESTs 12.28

401286 12.26

454020 AW962845 Hs.256527 ESTs 12.24

420077 AW512260 Hs.87767 ESTs 12.24

443837 AI984625 Hs.9884 spindle pole body protein 12.24

407519 X64979 gb:H.sapiens mRNA HTPCRX01 for olfactory 12.23

435839 AF249744 Hs.25951 Rho guanine nucleotide exchange factor ( 12.22

448552 AW973653 Hs.20104 hypothetical protein FLJ00052 12.20

405325 12.20

451009 AA013140 Hs.115707 ESTs 12.18

423066 Y18264 Hs.120171 ESTs 12.17 

439556 AI623752 Hs.163603 ESTs 12.16 

443062 N77999 Hs.8963 Homo sapiens mRNA full length insert cDN 12.15

445873 AA250970 Hs.251946 Homo sapiens cDNA: FLJ23107 fis, clone L 12.14

453542 AW836724 Hs.33190 Homo sapiens mRNA expressed only in plac 12.11 

440106 AA864968 Hs.127699 ESTs 12.10 

417605 AF006609 Hs.82294 regulator of G-protein signaling 3 12.10 

440266 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04

420061 AW024937 Hs.29410 ESTs 12.02

458727 AI022813 Hs.92679 Homo sapiens clone CDABP0014 mRNA sequen 11.96

445407 AI222658 Hs.221889 ESTs, Weakly similar to la costa [D.mela 11.95

418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo 11.94

414129 AI990287 Hs.270798 ESTs 11.93

409799 D11928 Hs.76845 phosphoserine phosphatase-like 11.92

438461 AW075485 Hs.286049 phosphoserine aminotransferase 11.92

443912 R37257 Hs.184780 ESTs 11.92 

424606 AA343936 gb:EST49786 Gall bladder I Homo sapiens 11.90 

434217 AW014795 Hs.23349 ESTs 11.90

451533 NM_004657 Hs.26530 serum deprivation response (phosphatidyl 11.90

422423 AF283777 Hs.116481 CD72 antigen 11.89

409398 AW386461 gb:PM4-PT0019-12129^004-F02 PT0019 Homo 11.89

423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 11.82

446180 AI074413 Hs.14220 hypothetical protein FLJ20450 11.80

414341 D80004 Hs.75909 KIAA0182 protein 11.80

406538 11.79

433253 AW450502 Hs.24218 ESTs 11.79

447397 BE247676 Hs.18442 E-1 enzyme 11.78

451684 AF216751 Hs.26813 COA14 11.76

416862 R23765 Hs.23575 ESTs 11.74

425770 NM_014363 Hs.159492 spastic ataxia of Charlevoix-Saguenay (s 11.72 

428826 AL048842 Hs.194019 attractin 11.72 

433037 NM_014158 Hs.279938 HSPC067 protein 11.72

447476 BE293466 Hs.20880 ESTs 11.72

452092 BE245374 Hs.27842 hypothetical protein FLJ11210 11.72

412922 M60721 Hs.74870 H2.0 (Drosophila)-like homeo box 1 11.72

401680 NM_005578 Hs.180398 LIM domain-containing preferred transloc 11.69

422576 BE548555 Hs.118554 CGW3 protein 11.68 

450203 AF097994 Hs.301528 L-kynurenine/alpha-aminoadipate aminotra 11.68

410531 AW752953 gb:QVOCT0224-261099K)35^02CT0224Homo 11.67 

425917 W28517 Hs.117167 Homo sapiens cDNA: FLJ23067 fis, clone L 11.66

418693 AI750878 Hs.87409 thrombospondin 1 11.64

400557 11.62
416188 BE157260 Hs.79070 v-myc avian myelocytomatosis viral oncog 11.60

419047 AW952771 Hs.90043 ESTs 11.59

420441 AI986160 Hs.88446 ESTs 11.59

400885 11.57

409853 AW502327 gb:UI-HF-BR0p-aka-a-07-0-UI.r1 NIH_MGC_5 11.56

400802 11.56

434540 NM_016045 Hs5184 TH1 drosophila homolog 11.55

431449 M55994 Hs.256278 tumor necrosis factor receptor superfamil 11.55

425928 S55736 Hs.238852 ESTs, Weakly similar to hypothetical pro 11.54

434701 M460479 Hs.4096 KIAA0742 protein 11.53

434228 Z42047 Hs.283978 ESTs; KIAA0738 gene product 11.52

420729 AW964897 Hs.290825 ESTs 11.52

428328 AA426080 Hs.98489 ESTs 11.50

433887 AW204232 Hs.279522 ESTs 11.50

414812 X72755 Hs.77367 monokine induced by gamma interferon 11.46

457718 F18572 Hs.22978 ESTs 11.44

452260 AA453208 Hs.28726 RAB9, member RAS oncogene family 11.42

459029 AA131376 Hs.285203 fibroblast growth factor 12 11.42

456267 AI127958 Hs.83393 cystatin E/M 11.39

433285 AW975944 Hs.237396 ESTs 11.38

449186 AW291876 Hs.196986 ESTs 11.37 

447861 AI434593 Hs.164294 ESTs 11.37 

456023 R00028 gb:ye70a06.s1 Soares fetal liver spleen 11.36

439444 AI277652 Hs54578 ESTs 11.31 

401163 11.31

430886 L36149 Hs.248116 chemokine (C motif) XC receptor 1 11.28

450784 AW246803 Hs.47289 ESTs 11.28

452391 AL044829 Hs.29331 carnitine palmitoyltransferase I, muscle 11.27

449625 NM_014253 Hs.23796 odz (odd Oz/ten-m, Drosophila) homolog 1 11.26

456827 AA075687 Hs.147176 epidermal growth factor receptor substra 11.24

439328 W07411 Hs.118212 ESTs, Moderately similar to ALU3_HUMAN A 11.24

432093 H28383 gb:yl52c03.r1 Soares breast 3NbHBst Homo 11.24

407335 AA631047 Hs.158761 Homo sapiens cDNA FLJ13054 fis, clone NT 11.23

442501 AA315267 Hs.23128 ESTs 11.22

429746 AJ237672 Hs.214142 5,10-methylenetetrahydrofolate reductase 11.21

422858 R35398 gb:yg64g10.r1 Soares infant brain 1 NIB H 11.20

415156 X84908 Hs.78060 phosphorylase kinase, beta 11.20

446713 AV660122 Hs.282675 ESTs 11.20

452221 C21322 Hs.11577 ESTs 11.20

418261 W78902 Hs.293297 ESTs 11.17

433332 AI367347 Hs.127809 ESTs 11.16 

434539 AW748078 Hs.214410 ESTs 11.16

413471 BE142098 gb:CM4-HT0137-220999-017-d11 HT0137 Homo 11.14

410037 AB020725 Hs.58009 KIAA0918 protein 11.14

405601 11.13

458332 AI000341 Hs.220491 ESTs 11.12

427654 AA410183 Hs.137475 ESTs 11.12 

427138 N77624 Hs.173717 phosphatidic acid phosphatase type 2B 11.10

431475 AI567669 Hs.287316 ESTs 11.10

425710 AF030880 Hs.159275 solute carrier family, member 4 11.08

413748 AW104057 Hs.19193 ESTs 11.07 

409208 Y00093 Hs51077 integrin, alpha X (antigen CD11C (p150), 11.07

457278 W92745 Hs.193324 ESTs 11.03

407021 U52077 gb:Human mariner1 transposase gene, comp 11.02

445701 AF055581 Hs.13131 lymphocyte adaptor protein 11.02

408338 AW867079 gb^R1-SN(X)33-12O40OO02<10SN0033Homo 10.95

401030 BE382701 Hs.25960 v-myc avian myelocytomatosis viral relat 10.95

437891 AW006969 Hs.6311 hypothetical protein FLJ20859 10.94

453874 AW591783 Hs56131 collagen, type XIV, alpha 1 (undulin) 10.94

421562 AA530994 Hs.105803 ghrelin precursor 10.92

413431 AW246428 Hs.75355 ubiquitin-conjugating enzyme E2N (homolo 10.92

400132 10.92

436420 AA443966 Hs.31595 ESTs 10.90

424880 NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator 10.88

433264 D85782 Hs.3229 cysteine dioxygenase, type I 10.88

429842 AI366213 Hs.173422 KIAA1605 protein 10.87

412406 AW948126 gb:RC0-MT0013-28030(H>31-a12 MT0013 Homo 10.85

400615 10.80

425018 BE245277 Hs.154196 E4F transcription factor 1 10.80
456011 BE243628 gb:TCBAP1D1053 Pediatric pre-B cell acut 10.79

455982 BE176862 gb:RC4-HT0587-17030(H)12-a04 HT0587 Homo 10.74

450418 BE218418 Hs.201802 ESTs 10.73

412490 AW803564 Hs.288850 ESTs 10.72 

436962 AW377314 Hs5364 DKFZP564I052 protein 10.70

437743 AI383497 Hs.131811 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.70

449967 R40978 Hs.271498 ESTs, Moderately similar to ALU1_HUMAN A 10.70

449590 AA694070 Hs.268835 ESTs 10.68 

446035 NM_006558 Hs.13565 Sam68-like phosphotyrosine protein, T-ST 10.68 

426530 U24578 Hs.170250 complement component 4A 10.66

428600 AW863261 Hs.15036 ESTs, Highly similar to AF161358 1 HSPC0 10.64

420090 AA220238 Hs.94986 ribonuclease P (38kD) 10.64 

451593 AF151879 Hs.26706 CGI-121 protein 10.62

438893 AF075031 Hs.29327 ESTs 10.62

459324 AW080953 gb'Jtc2Bc12j(1 NCI_CGAP_Co18 Homo sapiens 10.61

439883 AL359652 Hs.171096 Homo sapiens EST from clone DKFZp434A041 10.58

406513 AA715328 Hs.291205 ESTs 10.57

407826 AA128423 Hs.40300 calpain 3, (p94) 10.57

419550 D50918 Hs.90998 KIAA0128 protein; septin 2 10.56

428522 R10184 Hs.191987 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.56

459526 AI142350 Hs.146735 EST 10.55

411448 AA178955 Hs.271439 ESTs 10.54

410102 AW248508 Hs.279727 ESTs; 10.52

406577 10.52

408405 AK001332 Hs.44672 hypothetical protein FLJ10470 10.51

428966 AF059214 Hs.194687 cholesterol 25-hydroxylase 10.50

400880 10.48 

415875 AA894876 Hs.5687 protein phosphatase 1B (formerly 2C), ma 10.48

434715 BE005346 Hs.116410 ESTs 10.46 

406851 AA609784 Hs.180255 major histocompatibility complex, class 10.44

413409 AI638418 Hs.21745 ESTs 10.44

418489 U76421 Hs.85302 adenosine deaminase, RNA-specific, B1 (h 10.44 

419465 AW500239 Hs.21187 Homo sapiens cDNA: FLJ23068 fis, clone L 10.44

419544 AI909154 gb:QV-BT200-010499-007 BT200 Homo sapien 10.44

432180 Y18418 Hs.272822 RuvB (E coli homolog)-like 1 10.44

413822 R08950 Hs.272044 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.42

437446 AA788946 Hs.16869 ESTs, Moderately similar to CA1C_RAT COL 10.41

415701 NM_003878 Hs.78619 gamma-glutamyl hydrolase (conjugase, fol 10.41

443790 NM_003500 Hs.9795 acyl-Coenzyme A oxidase 2, branched chai 10.40

458873 AW150717 Hs.296176 STAT induced STAT inhibitor 3 10.38

415082 AA160000 Hs.137396 ESTs 10.37 

429124 AW505086 Hs.196914 minor histocompatibility antigen HA-1 10.36 

417187 AB011151 Hs.81505 KIAA0579 protein 10.34 

426827 AW067805 Hs.172665 methylenetetrahydrofolate dehydrogenase 10.34

424280 NMJXXXBO Hs.271366 alanine-glyoxylate aminotransferase homo 10.33

446099 T93096 Hs.17126 ESTs 10.32 

423445 NM_014324 Hs.128749 alpha-methylacyl-CoA racemase 10.31

409995 AW960597 Hs.30164 ESTs 10.30 

432242 AW022715 Hs.162160 ESTs, Weakly similar to ALU4_HUMAN ALU S 10.30

406394 AA172106 Hs.110950 Rag C protein 10.30

406189 10.29

422283 AW411307 Hs.114311 CDC45 (cell division cycle 45, S.cerevis 10.26

401598 AA172106 Hs.110950 Rag C protein 10.26

456995 T89832 Hs.170278 ESTs 10.26

416511 NM_006762 Hs.79356 Lysosomal-associated multispanning membr 10.24

427274 NM_005211 Hs.174142 colony stimulating factor 1 receptor, fo 10.24

401384 10.23

456226 D13168 Hs.82002 endothelin receptor type B 10.22

426928 AF037062 Hs.172914 retinol dehydrogenase 5 (11-cis and 9-cis 10.21

423032 AI684746 Hs.119274 ESTs 10.20

436556 AI364997 Hs.7572 ESTs 10.20

418400 BE243026 Hs.301989 KIAA0246 protein 10.19

437401 AA757196 Hs.121190 ESTs 10.19 

403690 10.17 

423790 BE152393 gb:CM2-HT0323-171199-033-a08 HT0323 Homo 10.16

434094 AA305599 Hs.238205 hypothetical protein PRO2013 10.16

434967 AW975009 Hs.292274 ESTs 10.16

432827 Z68128 Hs.3109 Rho GTPase activating protein 4 10.16 

432660 AI288430 Hs.64004 ESTs 10.14 
452234 AW084176 Hs.223296 ESTs 10.14

445629 AI245701 gb:qk31 (05 Jtl NCI_CGAP_Kid3 Homo sapiens 10.13

457236 AA626142 Hs.179991 ESTs, Weakly similar to KPCE_HUMAN PROTE 10.13 

444605 M174603 Hs.254105 enolase 1, (alpha) 10.12

450313 AI038989 Hs.24809 hypothetical protein FLJ10826 10.12

407482 NM_006056 10.12

449971 AA807346 Hs.288581 Homo sapiens cDNA FLJ14296 fis, clone PL 10.11

441201 AW118822 Hs.128757 ESTs 10.10

435157 AW014605 Hs.179872 ESTs 10.10 

417308 H60720 Hs.81892 KIAA0101 gene product 10.09

442582 AI204266 Hs.179303 ESTs 10.05 

437252 AI433833 Hs.164159 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.04

448663 BE614599 Hs.106823 H.sapiens gene from PAC 426I6, similar t 10.04

434467 BE552368 Hs.231853 Homo sapiens cDNA FLJ13445 fis, clone PL 10.04

423698 AA329796 Hs.1098 DKFZp434J1813 protein 10.02

412707 AW206373 Hs.16443 Homo sapiens cDNA: FLJ21721 fis, clone C 10.00

414658 X58528 Hs.76781 ATP-binding cassette, sub-family D (ALD) 10.00

421832 NM_016098 Hs.108725 HSPC040 protein 10.00

423554 M90516 Hs.1674 glutamine-fructose-6-phosphate transamin 10.00

452039 AI922988 Hs.172510 ESTs 10.00

434673 AW137442 Hs.136965 ESTs 10.00 

427978 AA418280 Hs.180040 Homo sapiens cDNA: FLJ22439 fis, clone H 10.00

457803 BE501815 Hs.198011 ESTs 9.99 

428279 AA425310 Hs.155766 ESTs 9.98 

444412 AI147652 Hs.216381 Homo sapiens clone HH409 unknown mRNA 9.98

417049 N72394 Hs.44862 ESTs 9.96 

427509 M62505 Hs.2161 complement component 5 receptor 1 (C5a l 9.96

445424 AB028945 Hs.12696 cortactin SH3 domain-binding protein 9.96 

443678 AW009605 Hs.231923 ESTs 9.96 

447567 AW474513 Hs.224397 ESTs, Weakly similar to B48013 proline-r 9.94

414709 AA704703 Hs.77031 Sp2 transcription factor 9.94 

434596 T59538 gb:yb65g12.s1 Stratagene ovary (937217) 9.94 

427630 BE276115 Hs.144980 ESTs, Weakly similar to CA13_HUMAN COLLA 9.93

416111 AA033813 Hs.79018 chromatin assembly factor 1, subunit A ( 9.92

423349 AF010258 Hs.127428 homeo box A9 9.92

424308 AW975531 Hs.154443 minichromosome maintenance deficient (S. 9.92

416814 AW192307 Hs.80042 dolichyl-P-Glc:Man9GlcNAc2-PP-dolichylgl 9.90

417886 AA481003 Hs.97128 ESTs 9.90 

425174 D87450 Hs.154978 KIAA0261 protein 9.90

438171 AW976507 Hs.293515 ESTs 9.90

421984 AW972187 Hs.110443 hypothetical protein FLJ22215 9.89

408597 NM_005291 Hs.46453 G protein-coupled receptor 17 9.88

413907 AI097570 Hs.71222 ESTs 9.87 

451296 AW801383 Hs.118578 H.sapiens mRNA for ribosomal protein L18 9.86 

433409 AI278802 Hs.25661 ESTs 9.85

450360 AW117416 Hs.245484 ESTs 9.85

433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prot 9.84

449824 AI962552 Hs.226765 ESTs 9.84

452744 AI267652 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 9.82 

431066 AF026273 Hs.249175 interleukin-1 receptor-associated kinase 9.82

426457 AW894667 Hs.169965 chimerin (chimaerin) 1 9.80

443371 AI792888 Hs.145489 ESTs 9.80

437159 AL050072 gb:Homo sapiens mRNA; cDNA DKFZp566E1346 9.75

425242 D13635 Hs.155287 KIAA0010 gene product 9.74

447498 N67619 Hs.43687 ESTs 9.74

426759 AI590401 Hs.21213 ESTs 9.73 

435129 AI381659 Hs.267086 ESTs 9.72 

437672 AW748265 Hs.5741 flavohemoprotein b5+b5R 9.72

438209 AL120659 Hs.6111 KIAA0307 gene product 9.72

438440 AA807228 Hs.225161 ESTs 9.72

449720 AA311152 Hs.288708 ESTs; Weakly similar to KIAA0226 [H.sapi 9.72

414291 AI289619 Hs.13040 ESTs 9.72

436206 AK001451 Hs.265561 CD2-associated protein 9.70 

446896 T15767 Hs.22452 Homo sapiens cDNA: FLJ21084 fis, clone C 9.70

412667 AW977540 Hs.269254 ESTs 9.70

423301 S67580 Hs.1645 cytochrome P450, subfamily IVA, polypept 9.67

440757 AW118645 Hs.160004 ESTs 9.67

441412 AI393657 Hs.159750 ESTs 9.66 

421044 AF061871 Hs.101302 collagen, type XII, alpha 1 9.66
414726 BE466863 Hs.280099 ESTs 9.66

418465 R91679 Hs.124981 ESTs 9.66 

433460 X02422 Hs.181125 immunoglobulin lambda locus 9.65

441530 AI248301 Hs.127112 ESTs 9.65 

433533 D53304 Hs.65394 ESTs 9.65

421470 R27496 Hs.1378 annexin A3 9.64

438613 C05569 Hs.243122 hypothetical protein FLJ13057 similar to 9.64

429324 M488101 Hs.199245 inactivation escape 1 9.62 

450244 AA007534 Hs.125062 ESTs 9.62 

407660 AW063190 Hs.279101 ESTs 9.61

406554 9.60 

426404 AA377607 Hs.273138 ESTs 9.58 

447045 AW392394 Hs.278569 KIAA0064 gene product 9.58

449894 AK001578 Hs.24129 hypothetical protein FLJ10716 9.58

448376 AI494332 Hs.196963 ESTs 9.58

407902 AL117474 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (fr 9.56

446572 AV659151 Hs.282961 ESTs 9.56 

459245 BE242623 Hs.31939 manic fringe (Drosophila) homolog 9.55

423545 AP000692 Hs.129781 chromosome 21 open reading frame 5 9.54

414697 BE266134 Hs.76927 translocase of outer mitochondrial membr 9.54

410846 AW807057 gb:MR4-ST0062-031199-018-b03 ST0062 Homo 9.52

421181 NM_005574 Hs.184585 LIM domain only 2 (rhombotin-like 1) 9.52

427308 D26067 Hs.174905 KIAA0033 protein 9.52

415995 NM_004573 Hs.994 phospholipase C, beta 2 9.51

434846 AW295389 Hs.119768 ESTs 9.51

414342 AA742181 Hs.75912 Homo sapiens cDNA: FLJ22199 fis, clone H 9.50

416959 D28459 Hs.80612 ubiquitin-conjugating enzyme E2A (RAD6 h 9.50

443123 AA094538 Hs.6588 ESTs 9.50 

439312 AA833902 Hs.270745 ESTs 9.48 

449375 R07114 Hs.271224 ESTs 9.48

436357 AJ132085 gb:Homo sapiens mRNA for axonemal dynein 9.44

458723 AW137726 Hs.244352 ESTs, Moderately similar to laminin alph 9.44

457526 AW450584 Hs.192131 ESTs, Weakly similar to RIBB [H.sapiens] 9.43

404741 9.43 

422409 NM_005428 Hs.116237 vav 1 oncogene 9.43

403708 9.42

408806 AW847814 Hs.289005 Homo sapiens cDNA: FLJ21532 fis, clone C 9.42

417380 T06809 gb:EST04698 Fetal brain, Stratagene (cat 9.42 

422501 AA354690 Hs.144967 ESTs 9.42 

426197 AA004410 Hs.167835 acyl-Coenzyme A oxidase 1, palmitoyl 9.42

452624 AU076606 Hs.30054 coagulation factor V (proaccelerin, labi 9.42

412110 AW893569 gb:RC0-NN0021-04040(M)21-c10 NN0021 Homo 9.41

414158 AA361623 Hs.288775 Homo sapiens cDNA FLJ13900 fis, clone TH 9.41

408101 AW968504 Hs.123073 CDC2-related protein kinase 7 9.40 

414171 AA360328 Hs.865 RAP1A, member of RAS oncogene family 9.40

415947 U04045 Hs.78934 mutS (E. coli) homolog 2 (colon cancer, 9.40

426959 BE262745 gb:601153869F1 NIH_MGC_19 Homo sapiens c 9.39

417519 AI689987 Hs.177669 ESTs, Weakly similar to RMS1_HUMAN REGUL 9.39 

457181 BE514362 Hs.296422 FK506-binding protein 3 (25kD) 9.39

402835 9.38

404632 9.38

446566 H95741 Hs.17914 Homo sapiens cDNA: FLJ22801 fis, clone K 9.37
455369 AW903533 gb:CM1-NN1031-060400-178-d05 NN1031 Homo 9.37

444001 AI095087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A 9.36 

458191 AI420611 Hs.127832 ESTs 9.36

431374 BE258532 Hs.251871 CTP synthase 9.34

429327 AA283981 Hs.199248 prostaglandin E receptor 4 (subtype EP4) 9.33

407061 X97748 gb:H.sapiens PTX3 gene promotor region. 9.33

416967 BE616731 Hs.80645 interferon regulatory factor 1 9.33 

423013 AW875443 Hs.22209 secreted modular calcium-binding protein 9.33

439461 AA693960 Hs.103158 ESTs 9.33

418830 BE513731 Hs.88959 Human DNA sequence from clone 967N21 on 9.32

422763 AA033699 Hs.83938 ESTs, Moderately similar to MASP-2 [H.sa 9.32

442739 NM_007274 Hs.8679 cytosolic acyl coenzyme A thioester hydr 9.32

452859 AI300555 Hs.288158 Homo sapiens cDNA: FLJ23591 fis, clone L 9.32

403237 9.32

415000 AW025529 Hs.239812 ESTs, Weakly similar to CALM_HUMAN CALMO 9.31

417951 AW976410 Hs.289069 Homo sapiens cDNA: FLJ21016 fis, clone C 9.30

419066 Z98492 Hs.6975 PRO1073 protein 9.30
448443 AW167128 Hs531934 ESTs 9.30

405125 9.30

409768 AW499566 gb:UWF^R0p^03-O4J).r1 NIH_MGC_5 9.28

453708 AI191811 Hs.54629 ESTs 9.28

442271 AF000652 Hs.8180 syndecan binding protein (syntenin) 9.27

410055 AJ250839 Hs.28241 gene for serine/threonine protein kinase 9.26

448692 AW013907 Hs524276 ESTs, Moderately similar to predicted us 9.26

417381 AF164142 Hs.82042 solute carrier family 23 (nucleobase tra 9.25

422497 D29642 Hs.1528 KIAA0053 gene product 9.25

414140 AA281279 Hs53317 ESTs 9.24

435980 AF274571 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 9.24

458530 BE395035 Hs.199889 ESTs, Weakly similar to KIAA0874 protein 9.24

402585 9.24

420819 AA280700 gb:zs95h1 U1 NCI_CGAP_GCB1 Homo sapiens 9.23

444755 AA431791 Hs.183001 ESTs 9.22

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 9.22

421246 AW582962 Hs.300961 ESTs, Highly similar to AF151805 1 CGI-4 9.20

421924 BE514514 Hs.109606 coronin, actin-binding protein, 1A 9.19

414888 AL039185 Hs.77558 thyroid hormone receptor interactor 7 9.18

434267 AI206589 Hs.116243 ESTs 9.17

409213 U61412 Hs.51133 PTK6 protein tyrosine kinase 6 9.17 

428242 H55709 Hs5250 leukemia inhibitory factor (cholinergic 9.16 

451736 AW080356 Hs.293684 ESTs, Weakly similar to alternatively sp 9.15

413627 BE182082 Hs546973 ESTs 9.14 

416134 AA528402 Hs.74861 activated RNA polymerase II transcriptio 9.14

449251 AW151660 Hs.31444 ESTs 9.14 

452813 U54727 Hs.191445 ESTs 9.14 

443622 AI911527 Hs.11805 ESTs 9.14 

413260 BE075281 gb:PM1-BT0585-2902(XM)OM07 BT0585 Homo 9.12 

413450 Z99716 Hs.75372 N-acetylgalactosaminidase, alpha- 9.12

446442 BE221533 Hs557858 ESTs 9.12 

438540 AA810021 Hs.136906 ESTs 9.12 

426251 M24283 Hs.168383 intercellular adhesion molecule 1 (CD54) 9.11

410290 AA402307 Hs.73818 ubiquinol-cytochrome c reductase hinge p 9.10

437398 AA913736 Hs.126715 ESTs 9.10

421559 NM_014720 Hs.105751 Ste20-related serine/threonine kinase 9.10

439699 AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 9.10

430799 C19035 Hs.164259 ESTs 9.09 

424544 M88700 Hs.150403 dopa decarboxylase (aromatic L-aminoaci 9.08

453942 AW190920 Hs.19928 ESTs 9.08

425844 T68073 Hs.159628 serine (or cysteine) proteinase inhibito 9.08

434658 AI624436 Hs.194488 ESTs 9.07

453999 BE328153 Hs540087 ESTs 9.06 

436490 R71543 Hs.18713 ESTs 9.05 

409192 AA065131 HS533439 ESTs, Weakly similar to ALU7_HUMAN ALU S 9.05

446223 BE300091 Hs.119699 hypothetical protein FLJ12969 9.04

447247 AW369351 Hs587955 Homo sapiens cDNA FLJ13090 fis, clone NT 9.04

450094 AI174947 Hs595789 Homo sapiens mRNA; cDNA DKFZp564D1164 (f 9.04

432012 AW301344 Hs.195969 ESTs 9.04

422520 AU076730 Hs.117977 kinesin 2 (60-70kD) 9.02

418650 BE386750 Hs.86978 prolyl endopeptidase 9.02

423008 M81590 Hs.123016 5-hydroxytryptamine (serotonin) receptor 9.02 

436476 AA326108 Hs.53631 ESTs * 9.02 

448206 BE622585 Hs.3731 ESTs 9.02 

55 431574 AW572659 Hs561373 adenosine A2b receptor pseudogene 9.01 

443453 R99876 Hs569882 ESTs 9.01 

435472 AW972330 Hs583022 triggering receptor expressed on myeloid 9.01 

420337 AW295840 Hs.14555 Homo sapiens cONA:FU21513fis, done C 9.00 

449810 AB008681 Hs53994 activin A receptor, type I1B 9.00 

60 406780 AA9Q2386 Hs586 ribosomal protein L4 8.99 

* 429169 AW341130 Hs.197757 ESTs, Moderately similar to FGFE_HUMAN F 8.99 

421326 " AF051428 Hs.103504 estrogen receptor 2 (ER beta) 8.97 

425491 AA883316 Hs555221 ESTs 8.96 

425516 BE000707 HS59567 ESTs 8.95 

65 439773 AI051313 Hs.143315 ESTs 8.96 

443247 BE614387 Hs.47378 ESTs 8.96 

456623 AI034125 Hs.108106 transcription factor 8.95 

438707 L08239 Hs.5326 porcupine 855 

402240 8.95 
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444152 AI125694 Hs. 14 9305 Homo sapiens cDNA RJ14264 fis, done PL 8.95 

409842 AW501756 gb:UI-HF.BR0p*jn><>09-<HJI.r1 NIH.MGC.5 8.94 

416277 W78765 Hs.73580 ESTs 8.94
456697 AI908006 Hs.111334 ferritin, light polypeptide 8.94
410762 AF226053 Hs.66170 HSKM-B protein 8.92
412942 AL120344 Hs.75074 mitogen-activated protein kinase-activat 8.92
442320 AI287817 Hs.129636 ESTs 8.92
449673 AA002064 Hs.18920 ESTs 8.91
411486 N85785 Hs.181165 eukaryotic translation elongation factor 8.90
437916 BE566249 Hs.20999 Homo sapiens cDNA: FLJ23142 fis, clone L 8.90
442732 AA257161 Hs.8658 hypothetical protein DKFZp434E0321 8.89
419741 NM_007019 Hs.93002 ubiquitin carrier protein E2-C 8.89

411499 AW849292 gb:IL3^0215<)2030(H)90-E06CT0215Horno 8.89 

431154 AW971228 Hs.290259 ESTs 8.89
414922 D00723 Hs.77631 glycine cleavage system protein H (amino 8.88
418036 Z37976 Hs.83337 latent transforming growth factor beta b 8.87
406422 8.87
422926 NM_016102 Hs.121748 ring finger protein 16 8.87
435220 D50030 Hs.104 HGF activator 8.86
418203 X54942 Hs.83758 CDC28 protein kinase 2 8.86
418613 AA744529 Hs.86575 mitogen-activated protein kinase kinase 8.85
439250 H66566 Hs.271711 ESTs 8.85
432359 AA076049 Hs.274415 Homo sapiens cDNA FLJ10229 fis, clone HE 8.84
450000 AI952797 Hs.10888 Homo sapiens cDNA: FLJ21559 fis, clone C 8.83
425657 T89839 Hs.119471 ESTs 8.83
425694 U51333 Hs.159237 hexokinase 3 (white cell) 8.82
419972 AL041465 Hs.294038 ESTs, Moderately similar to ALU2_HUMAN A 8.82
436396 AI683487 Hs.299112 Homo sapiens cDNA FLJ11441 fis, clone HE 8.82
413413 D82520 Hs.301834 Homo sapiens cDNA FLJ10952 fis, clone PL 8.82
428807 AA435997 Hs.104930 ESTs 8.82
415839 R40611 Hs.137565 ESTs 8.81
419553 N34145 Hs.250614 ESTs 8.80
420309 AW043637 Hs.21766 ESTs 8.80
421863 AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 8.80
447965 AW292577 Hs.94445 ESTs 8.80
459172 BE063380 gb:PM0^T0275-291099K)02-g1OBT0275 Homo 8.80
403259 8.78
411534 AW850473 gb:IL3-CT0219-28010O061-Bt1 CT0219 Homo 8.78
456161 BE264645 Hs.282093 Homo sapiens cDNA: FLJ21918 fis, clone H 8.77
413654 AA331881 Hs.75454 peroxiredoxin 3 8.76
401744 8.76
425348 AL137477 Hs.155912 cadherin-like 24 8.76
423396 AI382555 Hs.127950 bromodomain-containing 1 8.75
450649 NM_001429 Hs.297722 Human DNA sequence from clone RP1-85F18 8.75
408331 NM_007240 Hs.44229 dual specificity phosphatase 12 8.74
423872 AB020316 Hs.134015 uronyl 2-sulfotransferase 8.74
424906 AI566086 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3 8.74
427596 AA449505 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (f 8.73
432488 AA551010 Hs.216640 ESTs 8.72
448980 AL137527 Hs.22703 Homo sapiens mRNA; cDNA DKFZp434P1018 (f 8.72
429455 AI472111 Hs.292507 ESTs 8.71
429855 AW385597 Hs.138902 ESTs, Weakly similar to B34087 hypotheti 8.71
441746 H59955 Hs.127829 ESTs 8.70
411945 AL033527 Hs.92137 v-myc avian myelocytomatosis viral oncog 8.70
413492 D87470 Hs.75400 KIAA0280 protein 8.70
435706 W31254 Hs.7045 GL004 protein 8.70
433741 AA609019 Hs.159343 ESTs 8.70
426340 Z97989 Hs.169370 FYN oncogene related to SRC, FGR, YES 8.69
422779 AA317036 Hs.41989 ESTs 8.67
449785 AI225235 Hs.288300 Homo sapiens cDNA: FLJ23231 fis, clone C 8.67
420144 AA811813 Hs.119421 ESTs 8.66
420235 AA256756 Hs.31178 ESTs 8.66
432606 NM_002104 Hs.3066 granzyme K (serine protease, granzyme 3; 8.66
425762 BE244076 Hs.159578 Homo sapiens mRNA for FLJ00020 protein, 8.65
427448 BE246449 Hs.2157 Wiskott-Aldrich syndrome (eczema-thrombo 8.64
418033 W68180 Hs.259855 Homo sapiens cDNA FLJ12507 fis, clone NT 8.64
429084 AJ001443 Hs.195614 splicing factor 3b, subunit 3, 130kD 8.64
417094 NM_006895 Hs.81182 histamine N-methyltransferase 8.64
457277 NM_004736 Hs.227656 xenotropic and polytropic retrovirus rec 8.63
422631 BE218919 Hs.118793 hypothetical protein FLJ10688 8.63

410579 AW795196 Hs.215857 ring finger protein 14 8.63
431585 BE242803 Hs.262823 hypothetical protein FLJ10326 8.62
401851 8.62
401866 8.62
407783 AW996872 Hs.172028 a disintegrin and metalloproteinase doma 8.62
408242 AA251594 Hs.43913 PIBF1 gene product 8.62
422250 AW408530 Hs.113823 ClpX (caseinolytic protease X, E. coli) 8.62
430259 BE550182 Hs.127826 RalGEF-like protein 3, mouse homolog 8.62
452598 AI831594 Hs.68647 ESTs, Weakly similar to ALU7_HUMAN ALU S 8.62
419541 AW749617 gb:RC3-BT0502-130100-012-g07 BT0502 Homo 8.60
428839 AI767756 Hs.82302 ESTs 8.60
429328 AA829402 Hs.47939 ESTs 8.60
451491 AI972094 Hs.286221 Homo sapiens cDNA FLJ13741 fis, clone PL 8.60
452561 AI692181 Hs.49169 KIAA1634 protein 8.60
420027 AF009746 Hs.94395 ATP-binding cassette, sub-family D (ALD) 8.60
435205 X54136 Hs.181125 immunoglobulin lambda locus 8.60
430900 U91939 Hs.248123 G protein-coupled receptor 25 8.60
405074 8.59
437991 AI479773 Hs.181679 ESTs 8.59
436346 BE328882 Hs.193096 ESTs, Moderately similar to U119_HUMAN U 8.58
411079 AA091228 gb:cchn2152.seq.F Human fetal heart, Lam 8.57
418452 BE379749 Hs.85201 C-type (calcium dependent, carbohydrate- 8.56
429109 AL008637 Hs.196352 neutrophil cytosolic factor 4 (40kD) 8.56
448019 AW947164 Hs.195641 ESTs 8.56
449865 AW204272 Hs.199371 ESTs 8.55
431180 H55883 gb:yq94h03.M Soares fetal liver spleen 8.54
445988 BE007663 Hs.13503 inactivation escape 2 8.54
405876 8.54
407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 8.54
414807 AI738616 Hs.77348 hydroxyprostaglandin dehydrogenase 15-(N 8.54
425671 AF193612 Hs.159142 lunatic fringe (Drosophila) homolog 8.54
452413 AW082633 Hs.212715 ESTs 8.54
421620 AA446183 Hs.91885 ESTs 8.53
444539 AI955765 Hs.146907 ESTs 8.52
415102 M31899 Hs.77929 excision repair cross-complementing rode 8.51
405552 8.51
418068 AW971155 Hs.293902 ESTs, Weakly similar to prolyl 4-hydroxy 8.50
420133 AA426117 Hs.14373 ESTs 8.50
438887 R68857 Hs.265499 ESTs 8.50
446468 AI765890 Hs.16341 ESTs; Moderately similar to !!!! ALU SUB 8.50
446585 AV659397 Hs.282948 ESTs 8.50
441896 AW891873 gb^M3-ffT0090^40500-173^rn'0090HoiTO 8.50
437718 AI927288 Hs.196779 ESTs 8.48
420656 AA279098 Hs.187636 ESTs 8.48
429303 AW137635 Hs.44238 ESTs 8.48
450624 AL043983 Hs.125063 Homo sapiens cDNA FLJ13825 fis, clone TH 8.48
452573 AI907957 Hs.287622 Homo sapiens cDNA FLJ14082 fis, clone HE 8.48
456341 AA229126 Hs.122647 N-myristoyltransferase 2 8.48
423024 AA593731 Hs.75613 CD36 antigen (collagen type I receptor, 8.47
446985 AL038704 Hs.156827 ESTs, Weakly similar to ALU1_HUMAN ALU S 8.46
431778 AL080276 Hs.268562 regulator of G-protein signalling 17 8.46
400268 8.46
421828 AW891965 Hs.289109 dimethylarginine dimethylaminohydrolase 8.45
417022 NM_014737 Hs.80905 Ras association (RalGDS/AF-6) domain fam 8.44
421029 AW057782 Hs.293053 ESTs 8.44
425171 AW732240 Hs.200615 ESTs 8.44
459070 AI814302 gb:wj71c12.x1 NCI_CGAP_Lu19 Homo sapiens 8.42
406006 8.42
412643 AW971239 Hs.293982 ESTs 8.42
424775 AB014540 Hs.153026 SWAP-70 protein 8.42
446848 AW136083 Hs.195266 ESTs, Weakly similar to S59501 interfero 8.42
448043 AI458653 Hs.201881 ESTs 8.41
407183 AA358015 gb:EST66864 Fetal lung III Homo sapiens 8.40
412324 AW978439 Hs.69504 ESTs 8.40
419594 AA013051 Hs.91417 topoisomerase (DNA) II binding protein 8.40
430968 AW972830 gb:EST384925 MAGE resequences, MAGL Homo 8.40
431689 AA305688 Hs.267695 UDP-Gal:betaGlcNAc beta 1,3-galactosyltr 8.40
438582 AI521310 Hs.283365 ESTs, Weakly similar to ALU5_HUMAN ALU S 8.40
447685 AL122043 Hs.19221 hypothetical protein DKFZp566G1424 8.40

459119 AW844498 Hs.289052 Homo sapiens LENG8 mRNA, variant C, part 8.38
400817 8.37
425265 BE245297 gb:TCBAP1E2482 Pediatric pre-B cell acut 8.37
409385 AA071267 gb:zm61g01.r1 Stratagene fibroblast (937 8.36
439121 BE047779 Hs.44701 ESTs 8.38
419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 8.36
408327 AW182309 Hs.249963 ESTs, Highly similar to dJ1170K4.4 [H.sa 8.35
403976 8.34
448064 AA379036 gb:EST91809 Synovial sarcoma Homo sapien 8.33
442914 AW188551 Hs.99519 Homo sapiens cDNA FLJ14007 fis, clone Y7 8.33
428032 AW997704 Hs.11493 Homo sapiens cDNA FLJ13536 fis, clone PL 8.32
434194 AF119847 Hs.283940 Homo sapiens PRO1550 mRNA, partial cds 8.32
458677 AW937670 Hs.254379 ESTs 8.32
420925 NM_015698 Hs.100391 T54 protein 8.30
416475 T70298 gb:yd26g02.s1 Soares fetal liver spleen 8.30
416852 AF283776 Hs.80285 Homo sapiens mRNA; cDNA DKFZp586C1723 (f 8.30
430676 AF084866 gb:Homo sapiens envelope protein RIC-3 ( 8.30
428455 AI732694 Hs.98520 ESTs 8.29
435343 AW194962 Hs.199028 ESTs 8.29
450783 BE266695 gb:601190242F1 NIH_MGC J Homo sapiens cD 8.29
404946 8.28
422942 AF054839 Hs.122540 tetraspan 2 8.28
453716 AA037675 Hs.152675 ESTs 8.28
437098 AA744488 Hs.132842 ESTs, Moderately similar to ALU1_HUMAN A 8.28
443907 AU076484 Hs.9963 TYRO protein tyrosine kinase binding pro 8.27
401930 AF106069 Hs.23168 ubiquitin specific protease 15 8.26
446554 AA151730 Hs.301789 ESTs, Weakly similar to similar to C.ele 8.26
426290 AB007918 Hs.169182 KIAA0449 protein 8.25
419904 AA974411 Hs.18672 ESTs 8.25
413886 AW958264 Hs.103832 ESTs, Weakly similar to TRHY_HUMAN TRICH 8.24
424738 AI963740 Hs.46826 ESTs 8.24
427359 AW020782 Hs.79881 Homo sapiens cDNA: FLJ23006 fis, clone L 8.24
424534 D87682 Hs.150275 KIAA0241 protein 8.24
424429 U63830 Hs.146847 TRAF family member-associated NFKB activ 8.24
442604 BE263710 Hs.279904 ESTs 8.22
442992 AI914699 Hs.13297 ESTs 8.22
427210 BE396283 Hs.173987 eukaryotic translation initiation factor 8.22
457229 BE222450 Hs.266390 ESTs 8.21
423730 AA330214 gb:EST33935 Embryo, 12 week II Homo sapi 8.21
411928 AA888624 Hs.19121 adaptor-related protein complex 2, alpha 8.20
416051 AA835868 Hs.25253 Homo sapiens cDNA: FLJ20935 fis, clone A 8.20
417231 R40739 Hs.21326 ESTs 8.20
422049 W25760 Hs.77631 glycine cleavage system protein H (amino 8.20
427528 AU077143 Hs.179565 minichromosome maintenance deficient (S. 8.20
458776 AV654978 Hs.19904 cystathionase (cystathionine gamma-lyase 8.19
417687 AI828596 Hs.250691 ESTs 8.18
423218 NM_015896 Hs.167380 BLu protein 8.18
425397 J04088 Hs.156346 topoisomerase (DNA) II alpha (170kD) 8.18
406964 M21305 Hs.247946 Human alpha satellite and satellite 3 ju 8.18
402401 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.18
423397 NM_001838 Hs.1652 chemokine (C-C motif) receptor 7 8.18
427857 AL133017 Hs.2210 thyroid hormone receptor interactor 3 8.17
401519 8.17
447188 H65423 Hs.17631 Homo sapiens cDNA FLJ20118 fis, clone CO 8.16
424704 AI263293 Hs.152096 cytochrome P450, subfamily IIJ (arachido 8.16
435854 AJ278120 Hs.4996 DKFZP564D166 protein 8.14
448556 AW885606 Hs.5064 ESTs 8.14
449217 AA278536 Hs.23262 ribonuclease, RNase A family, k6 8.14
453124 AI139058 Hs.23296 ESTs 8.14
442812 AI018406 Hs.131284 ESTs 8.14
421129 BE439899 Hs.89271 ESTs 8.14
421129 BE439899 Hs.89271 ESTs 8.14 
TABLE 9A shows the accession numbers for those primekeys lacking a UnigeneID in Table
9. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accession

408057 1035720_-1 AW139555
408069 103655_1 H81795 Z42291 R20973 AA046920
408182 104479_1 AA047854 AA057506 AA053841
408338 1052148_1 AW867079 AW867086 AW182772
408828 108463_1 BE540279 AW410659 AA057857 R77693 BE278674
409126 110159_1 AA063426 AW962323 AW408063 AA063503 AA772927 AW753492 BE175371 AA311147
409292 111586_1 AA071051 AA070584 AA069938 AA102136 AA074430
409314 111841_1 AA070266 AA084967 AA126998
409385 112523_1 AA071267 T65940 T64515 AA071334
409398 1126716_1 AW386461 AW876408 AW386672 AW386599 AW876258 AW386619 AW386289 AW876136 AW876203 AW876213 AW876301 AW876295 AW876349 AW876365 AW876160 AW876369 AW876352 AW876271
409671 114731_1 AA076769 AA076781 AI087968
409768 1154035_1 AW499566 AW502378 AW499522 AW502046 AW502671 AW501917 AW501868 AW501721 AW502813
409841 1156088_1 AW502139 AW502432 AW502235 AW501683 AW502647
409842 1156119_1 AW501756 AW502096 AW502465 AW501715
409853 1156226_1 AW502327 AW502488 AW501829 AW502625 AW502687
410531 1207200_1 AW752953 H88044 BE156092
410688 1216101_1 AW796342 AW796356 BE161430
410846 1223902_1 AW807057 AW807054 AW807189 AW807193 AW807369 AW807429 AW807364 AW807365 AW807078 AW807256 AW807180 AW807331
410896 1226053_1 AW809637 AW809697 AW810554 AW809707 AW809885 AW810000 AW810088 AW809742 AW809816 AW809749 AW809639 AW809722 AW809836 AW809774 AW810023 AW810013 AW809813 AW809660 AW809728 AW809768 AW809951 AW809657 AW809954
411079 123128_1 AA091228 H71860 H71073
411424 1245497_1 AW845985 AW845991 AW845962
411499 1248105_1 AW849292 AW849431 AW849422 AW849428 AW849420 AW849424 AW849427
411507 1248607_1 AW850140 AW850195 AW850192
411534 1248827_1 AW850473 AW850471 AW850431 AW850523
411972 1268491_1 BE074959 AW880160
412110 1277844_1 AW893569 AW893571 AW893588 AW893593
412226 1284289_1 W26786 AW998612 AW902272
412257 1285376_1 AW903830 BE071916
412405 1293012_1 AW948126 AW948139 AW948196 AW948145 AW948162 AW948134 AW948127 AW948124 AW948153 AW948157 AW948125 AW948131 AW948158 AW948164 AW948151
413260 1356003_1 BE075281 BE075219 BE075123 BE075119 BE075046
413471 1371778_1 BE142098 BE142092
413729 1385114_1 BE159999 BE160056 BE160107 BE160139
414182 142409_1 AA136301 AI381776 AA136321
414989 1511339_1 T81668 C19040 C17569
415354 1534763_1 F06495 R24336 R13046
416011 1566439_1 H14487 R50911 Z43216
416475 1596398_1 T70298 K58072 R02750
417380 1672461_1 T06809 N75735
419392 1843934_-1 W28573
419541 185724_1 AW749617 R64714 AA244138 AA244137 BE094019
419544 185760_2 AI909154 AA526337 AA244193 AI909153
420819 196721_1 AA280700 AW975494 AA687385
421245 200620_1 AA285363 AA285333 AA285359 AA285326 AA285350
422673 219674_1 N59027 AA314694 N53937 R08100
422695 219996_1 AA315158 AW961298 N76067 AW802759 AI858495 W04474
422858 222209_1 R35398 BE252178 AA318153
422940 223106_1 BE077458 AA337277 AA319285
423730 231462_1 AA330214 AW962519 T54709
423790 232031_1 BE152393 AA330984 BE073904
424385 238731_1 AA339666 AW952809 AA349119
424606 241409_1 AA343936 AA344060 AW963081
425265 249175_1 BE245297 AA353976 AW505023
[illegible] 273830_-1 BE262745
430676 32168_1 AF084866 AF084870 AF084864 AF084867 AF084869 AF084865 AF084868 AW818206 AW812038 BE144813 BE144812 AW812041 AW812040 AW812067 BE061583 BE061604 T65808 AI352469 AA580921 BE141783 BE141782 BE061601 AW814393 AW885029
430968 326269_1 AW972830 AA527647 AA489820 AA570362
431180 328906_1 H55883 AW971249 AA493900 H55788
432093 341283_1 H28383 AW972670 H28359 AA525808
434596 38937_1 T59538 T59589 T59598 T59542 AF147374
436357 41842_1 AJ132085 Z83805
437159 43393_1 AL050072 AW900148
437495 43765_1 BE177778 BE177779 AL390180 AA359908
439097 46858_1 H66948 AF085954 H66949
439120 46879_1 H56389 AF085977 H56173
440134 48675_1 BE410734 BE560117 BE270054 BE296330 BE267957 AI003007 BE545259
441896 52842_1 AW891873 AW891897 BE564764
445629 645767_1 AI245701 BE272724
447229 71288_1 BE617135 AW504051 AW504283
448064 74761_1 AA379036 AA150589 AI696854 BE621316
450783 84655_1 BE266695 BE265474 N53200 BE267333
451045 85673_1 AA215672 AI696628 AA013335 H86334 AA017006
452549 921802_1 AI907039 AI907081
452560 922216_1 BE077084 AW139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW806211 AW806212 AW806207 AW806208 AW806210 AI907497
452712 928309_1 AW838616 AW838660 BE144343 AI914520 AW888910 BE184854 BE184784
453758 980026_1 U83527 AL120938 U83522
454093 1007366_1 AW860158 AW862385 AW860159 AW862386 AW862341 AW821869 AW821893 AW062660 AW062656
454563 1224342_1 AW807530 AW807540 AW807537 AW846086 BE141634 AW846089 AW807499 AW807533 AW838499
454791 1234759_1 BE071874 BE071882 AW820782 AW821007
454977 1247099_1 AW848032 AW848630 AW848478 AW848623 AW848484 AW848169 AW848830 AW848149 AW848119 AW848893 AW848903 AW848407
455131 1254674_1 AW857913 AW857916 AW857914 AW861627 AW861626 AW861624
455183 1259023_1 AW984111 AW863918 AW863856
455254 1266449_1 AW877015 AW877133 AW876978 AW877071 AW876988 AW877069 AW877063 AW877013
455369 1285173_16 AW903533 AW903516 AW903562 BE065202 BE085215 BE085214 BE085209 BE085172 BE085175 BE085193 BE085211 BE085199
455982 1396849_1 BE176862 BE176876 BE176947 BE176878
456011 1410860_1 BE243628 BE246081 BE247016 BE241984 BE241534 BE246091 BE245679 BE243620 BE245998 BE242329 BE241417 BE241457 BE242522 BE241989 BE241464
456023 1416335_1 R00028 BE247630
457586 360505_1 AW062439 AW751554 AA579463
457595 364225_-1 AA584854
457751 399422_1 AI908236 AA663731
459070 883688_1 AI814302 AI814426
459081 889426_1 W07808 AI822066
459145 918957_1 AI903354 AI903489 AI903488
459172 921149_1 BE063380 BE063346 AI906097
459234 945240_-1 AI940425
TABLE 9B shows the genomic positioning for those primekeys lacking unigene ID's and
accession numbers in Table 9. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
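
As an illustration only, the Nt_position entries of Table 9B can be read as comma-separated start-end ranges, one range per predicted exon. The minimal Python sketch below parses one such entry; the function names `parse_nt_position` and `total_exon_length` are hypothetical, and treating both endpoints as inclusive is an assumption that the table legend does not state.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Split an Nt_position entry into (start, end) integer pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        exons.append((int(start_text), int(end_text)))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming both endpoints are inclusive."""
    return sum(end - start + 1 for start, end in exons)


# Entry listed for Pkey 400557 (plus strand) in Table 9B.
exons = parse_nt_position("208453-208528,209633-209813")
print(exons)                     # [(208453, 208528), (209633, 209813)]
print(total_exon_length(exons))  # 257
```

Entries whose ranges are illegible in the table would raise a ValueError here, which is one way to flag rows that need manual review.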



Pkey Ref Strand Nt_position


APfiAZD R1 13550 


Minus 


90308-90505 


400557 9801261 Plus 208453-208528,209633-209813


£1)061 S QQQ8QQ4 


Plus 


118036-118166 118681-118807 


AOORCO R567R67 


Minus 


174571-174856 


Anna 17 rrrqqca 


Plus 


170793-170948 


iiYtAAO QQ311P1 


Pius 


29235-29336 36363-36580 


4VAJ0O0 yjOO 10/ 


Minus 






Minus 


cpnqQ.cp-icfl ^qcr-MIPO 54957-55052 55420-55480.56452-56666 57221-57718 


/(YKJCO 7CCQXQ1 


Plus 


192667-192826 194387-194876 

i%vu( iatutv ( iv*ww< i«iv(v 




Plus 


159197-159320 


4U1U44 Oil/Ola 


PIlK 


73501-73674 


Wile** OO/UtoO 


MiniiQ 

(VHllllo 


124181-124391 


4UIIOO oao loiiU 


Plus 




4A10ni Q7A^5Q7 
4Ul£,Ul if/*»OOOf 


Mimic 


138534-138629 139234-139294 140121-140335 142033-142479 


4Ulcot> 30ul«yt£ 


Minus 


14703fi-14731R 


4U1004 ooouyoy 


Minus 




4U14O0 04OO0ZD 


PllIC 


1*Vl5fi-134ft9 


^nic-ic; 7fi'?fw;i 

4U1D10 /OOUODI 


PllIC 


99990-301 26 


4Uioiy owyjio 


Pit m 




Af\iK70 OfiQRIQR 

4Uio/c yoooioo 


PllIC 


1 98596- 1 98704 1 30755-1 30860 

ItOO^D ItOflrt, l«X/(03 IOUCKJU 


4UI/44 co/oo^y 


Pius 


14595-14751 


4rvtQC1 777f\49£ 


Minus 


146443-146664 147794-147971 148351-148480148980-149111 149801-149949 


401 odd oOlolUo 


Pius 




iinooyiA 7como< 
4utx4U /DaUlol 


Ptnc 


1043n9. 104597 Mfi1 36-1 0637? 


4uzooy y<£i itU4 


Minus 






Minus 


174893-175050 183210-183435 


402788 9798102 Plus 98273-101430
402802 3287156 Minus 53242-53432
402812 6010110 Plus 25026-25091,25844-25920
402828 8918414 Plus 69071-69642
402835 9187337 Plus 26961-27101
402838 9369121 Minus 32589-32735,35478-35666
402842 9369121 Minus 76355-76479
402895 9967547 Plus 85537-85671,86379-86469
402964 9581599 Minus 46624-46784
403137 9211494 Minus 92349-92572,92958-930^,93579-93712,93949-94072^91-W
403237 7637807 Plus 7271-7527
403259 7770585 Plus 4693-4857
403683 7331517 Plus 217175-217446
403690 7387384 Minus 78627-79583
403708 5705981 Minus 134394-134812
403838 4176355 Plus 19197-19502
403851 7708872 Plus 22733-23007
403976 7657840 Plus 24755-24969
404407 7329316 Minus 48154-48499
404426 7407959 Plus 77842-77954
404632 9796668 Plus 45096-45229
404741 8574139 Plus 143025-143487
404756 7706327 Plus 82849-83627
404946 7382189 Plus 134445-134750
405074 7770440 Plus 44340-44559,44790-45059
405125 8247873 Plus 137113-137814
405172 9966752 Plus 153027-153262
405236 7249076 Minus 151099-151915
405325 6094661 Minus 25816-[illegible]
405411 3451356 Minus 17503-17778,18021-18290
405495 8050952 Minus 72182-72373
405552 1552506 Plus 45199-45647
405601 5815493 Minus 147835-147935,149220-149299
405685 4508129 Minus [illegible]
405777 7263187 Minus 104773-105051
405856 7653009 Plus 101777-102043
405876 6758747 Plus 39694-40031
405932 7767812 Minus 123525-123713
405934 6758795 Plus 159913-160605
406006 8247801 Minus 42640-42776
406134 9163473 Plus 153291-153452
406189 7289992 Minus 22007-22234
406422 9256411 Plus 163003-163311
406516 7711422 Minus 128375-128449,128560-128784
406538 7711478 Plus 35196-35367,38229-38476,40080-40216,43522-43840
406554 7711566 Plus 106956-107121
406577 7711730 Plus 11377-11509
TABLE 10 shows genes, including expression sequence tags, differentially expressed in
taxol resistant prostate tumor xenografts as compared to taxol sensitive prostate tumor
xenografts. The genes are indicated as either being upregulated or downregulated during the
induction of taxol resistance in sequential passages of the grafts.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Eos: Internal Eos name

F00-F14: passage number
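
To illustrate how the passage columns of Table 10 relate to the up/down label, the following Python sketch compares the mean signal of the earlier passages with that of the later passages. This is a hypothetical rule chosen for illustration; the specification does not state how the response label was assigned, and the function name `classify_response` is an assumption.

```python
from statistics import mean
from typing import Sequence


def classify_response(signals: Sequence[float]) -> str:
    """Label a row 'up', 'down', or 'flat' from its per-passage signals.

    Hypothetical rule: compare the mean of the first half of the
    passages with the mean of the second half.
    """
    half = len(signals) // 2
    early = mean(signals[:half])
    late = mean(signals[half:])
    if late > early:
        return "up"
    if late < early:
        return "down"
    return "flat"


# Passage values listed in Table 10 for Pkey 126645 (STEAP) and Pkey 117921.
steap = [106, 111, 103, 71, 34, 67, 33, 14, 2, 1, 1, 1]
pkey_117921 = [1, 9, 8, 9, 32, 20, 34, 122, 105, 82, 71, 111]
print(classify_response(steap))        # down
print(classify_response(pkey_117921))  # up
```

A rule of this kind ignores replicate structure and noise; a real analysis would likely use a statistical test or a fitted trend rather than a simple split.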



Pkey ExAccn UnigeneID Unigene Title Eos Resp. F00 F00 F02 F02 F05 F05 F07 F09 F10 F11 F13 F14

117921 N51002 Hs.47170 UprinA2 PM28 UP 1 9 8 9 32 20 34 122 105 82 71 111
112971 T17185 Hs.4299 ESTs CHA1 down 290 281 267 335 270 284 150 157 83 89 49 75
126645 AI167942 Hs.61635 STEAP PAA5 down 106 111 103 71 34 67 33 14 2 1 1 1
119018 N95796 Hs.179809 ESTs PAB2 down 765 841 757 909 742 704 478 428 253 175 228 238
110844 N31952 Hs.167531 ESTs PAW down 175 192 147 141 123 129 73 65 55 48 54 84
100654 HG2841-HT2969 Hs.75442 Albumin, A PM01 down 666 605 504 728 357 445 602 187 117 127 117 113
100655 HG2641-HT2970 Hs.75442 Albumin, A PM02 down 620 653 466 688 368 386 606 175 101 95 115 97
102076 U09579 HS252437 cyclin-dep PM03 down 101 94 143 190 105 107 88 40 34 31 46 22
102208 U22961 Hs.75442 albumin PM04 down 495 424 323 518 252 296 467 188 169 143 165 145
103739 AA075779 mitochondr PM05 down 75 190 606 230 378 106 218 88 69 192 69 99
107036 AA599690 Hs.15725 SBBI48 PM06 down 67 124 115 168 132 111 66 71 49 70 38 50
108242 AA062746 ESTs PM07 down 14 20 252 13 22 43 193 10 10 104 21 18
108282 AA065143 solute car PM08 down 27 54 178 73 108 37 53 24 14 53 15 34
108679 AA115963 beta-1-glo PM09 down 680 693 1292 656 869 389 1 74 118 662 359 409
108731 AA126313 Hs.107476 ATP syntha PM10 down 10 19 185 25 60 1 32 3 7 14 1 1
110675 H89355 Hs.6598 adrenergic PM11 down 207 334 237 239 231 220 119 145 93 64 56 124
115412 AA283804 Hs.193552 ESTs PM12 down 148 316 282 271 340 334 115 238 100 196 83 207
115844 AA430124 Hs.234607 MDM2 PM13 down 49 93 94 154 132 91 23 54 23 76 14 41
120588 AA281591 Hs.16193 ESTs PM14 down 80 157 58 141 159 127 39 83 35 37 16 46
132349 Y00705 Hs.181286 serine pro PM15 down 146 217 214 150 106 128 177 85 54 63 66 56
132888 AM90775 Hs.5920 N-acetylma PM16 down 92 150 132 178 126 139 53 94 48 67 41 80
132967 AA032221 Hs.61635 STEAP PM17 down 224 208 203 215 205 180 132 65 68 50 48 63
133063 AA283085 Hs.64065 ESTs PM18 down 85 148 161 150 92 108 42 99 42 65 29 126
134374 D62633 Hs.8236 ESTs PM19 down 230 240 194 212 231 189 89 123 107 95 68 91
135400 M23263 Hs.99915 androgen r PM20 down 36 167 99 178 132 101 23 71 26 122 14 44
TABLE 11 shows genes, including expression sequence tags, that are up-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
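
Because R1 is reported as normal prostate divided by prostate tumor, a small R1 corresponds to a large tumor-over-normal fold change. The short Python sketch below makes that conversion explicit; the function name `tumor_fold_change` is hypothetical, and the sketch assumes R1 is a simple ratio of background-subtracted averages as the legend describes.

```python
def tumor_fold_change(r1: float) -> float:
    """Convert R1 (normal : tumor) into a tumor : normal fold change."""
    if r1 <= 0:
        raise ValueError("R1 must be a positive ratio")
    return 1.0 / r1


# An R1 of 0.02 corresponds to roughly 50-fold higher signal in tumor tissue.
print(round(tumor_fold_change(0.02), 1))
```

Under this reading, the first Table 11 entry (R1 of 0.012) would correspond to a tumor signal on the order of 80 times the normal signal.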



Pkey 


ExAccn 


UnigenelD 


101336 


L49169 


Hs.75678 


130642 


M 63438 


Hs.156110 


133512 


X01677 


Hs.195188 


133436 


H44631 


Hs.737 


129292 


X13810 


Hs.1101 


100610 


HG2566-HT4792 




133448 


M34516 


Hs.170116 


125193 


W67577 


Hs.84298 


133456 


T49257 


Hs.183704 


134546 


AA459310 


Hs.8518 


102131 


U15085 


Hs.1162 


101375 


M13560 


Hs.64298 


100674 


HG3033-HT3194 




134365 


R32377 


Hs.82240 


132335 


D60387 


Hs.189885 


110303 


H37901 


Hs.32706 


131678 


N59162 


Hs.30542 


116599 


D80046 


Hs.250879 


133769 


M17733 


Hs.75968 


107904 


AA026648 


Hs.61389 


129427 


T80746 


Hs.11 1334 


105987 


AA406631 


Hs.1 10299 


131466 


F03233 


HS27189 


102859 


X00274 


Hs.76807 


134626 


S82198 


Hs.8709 


134170 


M63138 


Hs.79572 


131713 


X57809 


Hs.181125 


100748 


HG3517-HT3711 




118769 


N74496 




111734 


R25375 


Hs.126916 


109221 


AA192755 


Hs.85840 


133846 


M480073 


Hs.76719 


135281 


AA401575 


Hs.97757 


119073 


R32894 


Hs.45514 


100760 


HG3576-HT3779 




101426 


M19483 


Hs.25 


129568 


AA428025 


Hs.1 14360 


130900 


Z38468 


Hs.21036 


133879 


M 13829 


Hs.77183 


100627 


HG2702-HT2798 




129424 


M55593 


Hs.1 11301 


128652 


AA621245 


Hs.103147 


129979 


T72635 


Hs.1 3956 


133468 


X03068 


Hs.73931 


102636 


U67092 




129536 


M33493 


Hs.184504 


133599 


M64788 


Hs.75151 



Unigene Title 

FBJ murine osteosarcoma viral oncogene homo log B 

Immunoglobulin kappa variable 1D-8 

gfycera!dehyde~3-phosphate dehydrogenase 

immediate early protein 

POU domain; class 2; transcription factor 2 

Microtubule-Associated Protein Tau, All Spliced, Exon 8 

immunoglobulin lambda-like polypeptide 3 

CD74 antigen (invariant polypeptide of major histocompatibility 

complex; class II antigen-associated) 

ubiquitin C 

Homo sapiens mRNA; cDNA DKFZp586L1722 (from clone 
DKFZp586L1722) 

major histocompatibility complex; class II; DM beta 

CD74 antigen (invariant polypeptide of major histocompatibility 

complex; class II antigen-associated) 

SpRceosomal Protein Sap 62 

syntaxin 3A 

ESTs 

ESTs 

ESTs 

ESTs 

thymosin; beta 4; X chromosome 
ESTs 

ferritin; light polypeptide 
mitogen-activated protein kinase kinase 7 
ESTs 

Human HLA-DR alpha-chain mRNA 

catdecrin (serum calcium decreasing factor, elastase IV) 

cathepsin D (lysosomal aspartyt protease) 

immunoglobulin lambda gene cluster 

Alpha- 1 -Antitrypsin, 5' End 

ESTs 

ESTs 

ESTs; Weakly similar to stac [Rsapiens] 
U6 snRNA-assoctated Sm-Iike protein 
ESTs 

v-ets avian erythroblastosis virus E26 oncogene related 

Major Histocompatibility Complex, Class ti Beta W52 

ATP synthase; H+ transprtng; mttochndri F1 complex; beta pofypept 

transforming growth factor beta-stimulated protein TSC-22 

ESTs; Moderately similar to F25965.3 [H.sapiens] 

v-raf murine sarcoma 3611 viral oncogene hornotog 1 

Serine/Threonine Kinase (Gb225424) 

matrix metalloproteinase 2 (gelatinase A; 72kD getatinase; 

72kD type IV collagenase) 

ESTs; Weakly similar to similar to SP:YR40_BACSU [Cetegans] 
ESTs 

major histocompalMity complex; class (I; OQ beta 1 
Human ataxia-telangiectasia locus protein (ATM) gene, exons 
1a. 1b, 2, 3 and 4, partial cds 
tryptase; alpha 

RAP1; GTPase activafing protein 1 
194 



R1 

0.012 

0.015 

0.017 

0.017 

0.019 

0.02 

0.021 

0.022 
0.022 

0.023 
0.023 

0.023 

0.024 

0.027 

0.027 

0.028 

0.028 

0.029 

0.029 

0.03 

0.03 

0.03 

0.032 

0.032 

0.032 

0.033 

0.034 

0.034 

0.034 

0.036 

0.036 

0.036 

0.037 

0.037 

0.037 

0.038 

0.038 

0.039 

0.039 

0.039 

0.039 
0.039 
0.039 
0.04 

0.04 
0.04 
0.041 
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102104 


U12139 


Hs.25817 


Human atphal (XI) collagen (COL1 1 A1) gene, 5' region and exon 1 


0.041 


131340 


AA47B305 


Homo sapiens chromosome 19; cosmid R27216 


0.041 


130446 


X79510 


Hs.155693 


protein tyrosine phosphatase; non-receptor type 21 


0.042 


101352 


L77701 


Hs.16297 


COX17 (yeast) homolog; cytochrome c oxidase assembly protein 


0.042 


122593 


AA453310 


Hs.128749 


alpha-methytacyl-CoA race ma se 


0.042 


130181 


R39552 


Hs.151608 


Homo sapiens clone 23622 mRNA sequence 


0.042 


134071 


Z14093 


Hs.78950 


branched chain keto acid dehydrogenase E1; alpha polypeptide (maple syrup urine disease)


0.042 


108129 


AA053252 


Hs.185848 


ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]


0.043 


130511 


L32137 


Hs.1584 


cartilage oligomeric matrix protein (pseudoachondroplasia; epiphyseal dysplasia 1; multiple)


0.043 


133336 


AA291456 


Hs.71190 


ESTs 


0.043 


132982 


LQ2326 


Hs.198118 


immunoglobulin lambda-like polypeptide 2 


0.044 


131880 


AA047034 


Hs.33818 


RecQ protein-like 5 


0.044 


130540 


U35234 


Hs.159534 


protein tyrosine phosphatase; receptor type; S 


0.044 


133467 


AA258595 


Hs.73931 


major histocompatibility complex; class II; DQ beta 1 


0.044 


101191 


L20688 


Hs.83656 


Rho GDP dissociation inhibitor (GDI) beta 


0.044 


101860 


M95610 


Hs.37165 


collagen; type IX; alpha 2 


0.044 


102799 


U88898 




Human endogenous retroviral H protease/integrase-derived ORF1 
mRNA, complete cds, and putative envelope prot mRNA, partial cds 


0.044 


107200 


D20350 


Hs.5628 


ESTs 


0.044 


101166 


L14927 


Hs.2099


lipocalin 1 (protein migrating faster than albumin; tear prealbumin)


0.044 


134289 


M54915


Hs.81170 


pim-1 oncogene


0.044 


135329 


AA436026 


Hs.98858 


ESTs 


0.044 


124950 


T03786 


Hs.151531 


protein phosphatase 3 (formerly 2B); catalytic subunit; beta isoform (calcineurin A beta)


0.044 


102919 


X12447 


Hs.183760 


aldolase A; fructose-bisphosphate


0.044 


100574 


HG2279-HT2375 




Triosephosphate Isomerase


0.045 


131286 


AA450092 


Hs.25300 


Homo sapiens clones 24718 and 24625 mRNA sequence 


0.045 


102675 


U72512 




Human B-cell receptor associated protein (hBAP) alternatively 
spliced mRNA, partial 3'UTR 


0.045 


131332 


R50487 


Hs^5717 


ESTs 


0.045 


101634 


M57731 


Hs.75765 


GRO2 oncogene


0.046 


113118 


T47906 


Hs.220512 


ESTs 


0.046 


124884 


R77276 


Hs.120911 


ESTs 


0.046 


130523 


W76097 


Hs.214507 


ESTs 


0.046


110244 


H26742 


Hs.25367 


ESTs; Weakly similar to ALR [H.sapiens]


0.046 


131932 


AA454980 


Hs.25601 


chromodomain helicase DNA binding protein 3 


0.046


132509 


H09751 


Hs.5038 


neuropathy target esterase 


0.046 


133372 


AA291139 


Hs.72242 


ESTs 


0.046 


100817 


HG4011-HT4804 




Dystrophin-Associated Glycoprotein, 50 Kda, Alt. Splice 2


0.047 


106746 


AA476436 


Hs.7991 


ESTs 


0.047 


135401 


L14813 


Hs.169271 


carboxyl ester lipase-like (bile salt-stimulated lipase-like)


0.047 


130479 


B44163 


Hs.12457 


Homo sapiens clone 23770 mRNA sequence 


0.047 


102589 


U62015 


Hs.8867 


cysteine-rich; angiogenic inducer, 61 


0.047 


121521 


AA412165 


Hs.97358 


EST 


0.048 


135340 


AA425137 


Hs.99093 


Homo sapiens chromosome 19; cosmid R28379 


0.048 


132336 


AA342422 


Hs.45073 


ESTs 


0.048 


115368 


AA282133 


Hs.88960 


ESTs; Weakly similar to similar to collagen [C.elegans]


0.048 


101278 


L38487 


Hs.110849


estrogen-related receptor alpha 


0.048 


103284 


X6Q200 


Hs.8375 


TNF receptor-associated factor 4


0.048 


100564 


HG2239-HT2324 




Potassium Channel Protein (Gb211585)


0.048 


133132 


Z40883 


Hs.65588 


ESTs; Weakly similar to dJ393P122 [H.sapiens]


0.048 


121811 


AA424535 


Hs.98416 


ESTs 


0.048 


129613 


AA279481 


Hs.238831 


ESTs; Weakly similar to collagen alpha 1(XVIII) chain [M.musculus]


0.049 


132468 


S79854 


Hs.49322 


deiodinase; iodothyronine; type III


0.049 


120111 


W95841 


Hs.136031 


ESTs 


0.049 


103668 


Z83741 


Hs.248174


H2A histone family; member M 


0.049 


130386 


F10874 


Hs^34249 


mitogen-activated protein kinase 8 interacting protein 1


0.049 


104275 


C02170 


Hs.39387 


ESTs; Weakly smlr to weak smlrity to ribosomal prot L14 [C.elegans]


0.049 


106305 


AA436146 


Hs.12828 


ESTs 


0.05 


116431 


AA609878 


Hs.55289 


ESTs; Weakly smlr to 110 KD CELL MEMBRANE GLYCOPROTEIN [H.sapiens] 0.813


120339 


AA206465 


Hs.256470


EST 


0.05 


114427 


AA017063 




ESTs; Highly similar to Miz-1 protein [H.sapiens]


0.05 


118821 


N79070 


Hs.94789


ESTs 


0.05 


118979 


N93798 


Hs.43666 


protein tyrosine phosphatase type IVA; member 3 


0.05 


107495 


W78776 


Hs.90375 


ESTs 


0.051 


120240 


Z41732 


Hs.66049 


ESTs 


0.051 
114331
130947 
129242 
131413 
112304 
101416 
131201 
101054 
101306 
129311 

129942 
119210 
101046 
114086 
110171 
101004 
129715 
101581 
113285 
127537 
100813 
101841 
135053 
101419 
119724 
102673 
129877 
114788 
123812 
117669 
123782 
102395 
133795 
123193 
132595 
104161 
115330 
112893 
133475 



102940 
131299 
102495 
129594 
118593 
126702 



130538 
114299 
115604 
106052 
131730 
131265 
129705 
123175 
103592 
118196 

104886 
104250 

113301 
110441 
125297 
135258 
130633 
112006 



Z41309 

R40037 

W81679 

AA482390 

R54798 

M17254 

AA426304 

K02405 

L41143 

T55087 

U95301 

R93340 

K01160

Z38266 

H19964 

J04101 

N58479 

M34996 

T66B30 

AA569531 

HG3995-HT4265 

M93107 

R77159 

M17886 

W69468 

U72509 

AA248589 

AA156737 

AA620607 

N39237 

AA610111 

U41767 

M12529 

AA489228 

AA253369 

AA456471 

AA281145 

T08000 

129217 

K03207 

X13956 

AA431464 

U51240 

R70379 

N69Q20 

U54602 

N27368 

M20786 

Z40782 

AA400378 

AA416947 

U05681 

AA479498 

X78706 

AA489010 

Z30644 

N59478 

AA053348 
AF000575 

T67452 

H50302 

Z39215 

AA292423 

T92363 

R42607 



Hs.12400 

Hs£1506 

Hs.5174 

Hs£6510 

Hs.26239 

Hs.45514 

Hs.24174 

Hs.73933 

Hs.232069 



Hs.144442 
Hs.92995 

Hs.12770 

Hs.31709 

Hs.248109

Hs.12126 

Hs.198253 

Hs.182712 

Hs.162859 

Hs.76893
Hs.93678 
Hs.177592 
Hs.47622 

Hs.13094 

Hs.103904 

Hs.111591 

Hs.44977 

Hs.162695 

Hs.92208 

Hs.169401 

Hs.136956 

Hs.155742 

Hs.7724 

Hs.88827 

Hs.194684 

Hs.73987 

Hs.103972 

Hs.24998 

Hs.25426 

Hs.79356 

Hs.115396 

Hs.207689 

Hs.2785 

Hs.212414 

Hs.159509 

Hs.22920 

Hs.49391 

Hs.6382 

Hs.31210 

Hs.25274

Hs.12068 

Hs.178400 

Hs.123059 

Hs.48396 

Hs.144626 
Hs.105928 

Hs.13104 

Hs.19845 

Hs.159409 

Hs.97272 

Hs.178703 

Hs.22241 



ESTs 0.051 

ESTs 0.052 

ribosomal protein S17 0.052 

ESTs; Modly smlr to vacuolar prot sorting homolog r-vps33b [R.norvegicus] 0.052

ESTs 0.052 

v-ets avian erythroblastosis virus E26 oncogene related 0.052 

ESTs 0.052 

Human MHC class II HLA-DQ-beta mRNA (DR7 DQw2); complete cds 0.052

T-cell leukemia translocation altered gene 0.053

yb45c08.r1 Stratagene fetal spleen (#937205) Homo sapiens cDNA clone IMAGE:74126 5', mRNA sequence. 0.053

phospholipase A2; group X 0.053

ESTs 0.053 

Accession not listed in Genbank 0.053 

Homo sapiens PAC clone DJ0777O23 from 7p14-p15 0.053

ESTs 0.053 

v-ets avian erythroblastosis virus E26 oncogene homolog 1 0.053

ESTs; Weakly similar to LR8 [H.sapiens] 0.053

major histocompatibility complex; class II; DQ alpha 1 0.053

ESTs 0.053 

ESTs 0.054 

Cpg-Enriched Dna, Clone S19 0.054 

3-hydroxybutyrate dehydrogenase (heart; mitochondrial) 0.054 

ESTs 0.054 

ribosomal protein; large; P1 0.054

ESTs 0.055 

Human alternatively spliced B8 (B7) mRNA, partial sequence 0.055 

ESTs; Weakly similar to ORF YGR101w [S.cerevisiae] 0.055

EST 0.055 

ESTs 0.055 

ESTs 0.055 

EST 0.055 

a disintegrin and metalloproteinase domain 15 (metargidin) 0.055

apolipoprotein E 0.055

ESTs 0.056 

glyoxylate reductase/hydroxypyruvate reductase 0.056

KIAA0963 protein 0.056 

ESTs 0.056 

bassoon (presynaptic cytomatrix protein) 0.056 

CDC-like kinase 3 0.056

proline-rich protein BstNI subfamily 4 0.056

Hu 12S RNA induced by poly(rI); poly(rC) and Newcastle disease virus 0.056

ESTs; Weakly similar to unknown [H.sapiens] 0.057

Lysosomal-associated multispanning membrane protein-5 0.057 

Human germline IgD chain gene; C-region; C-delta-1 domain 0.057 

EST 0.057 

keratin 17 0.057 
sema domain; immunoglobulin domain (Ig); short basic domain; secreted; (semaphorin) 3E 0.057

alpha-2-plasmin inhibitor 0.057

similar to S68401 (cattle) glucose induced gene 0.057 

ESTs 0.057 

ESTs; Highly similar to KIAA0612 protein [H.sapiens] 0.057

B-cell CLL/lymphoma 3 0.057

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.058

carnitine acetyltransferase 0.058

ESTs 0.058 

chloride channel Kb 0.058 
ESTs; Moderately similar to tumor necrosis factor-alpha-induced protein B12 [H.sapiens] 0.058

growth differentiation factor 11 0.058
leukocyte immunoglobulin-like receptor; subfamily B (with TM and ITIM domains); member 3 0.058

EST 0.058 

ESTs; Highly smlr to prot phosphatase 2A BR gamma subunit [H.sapiens] 0.058

ESTs 0.058 

ESTs; Weakly similar to dJ281H82 [H.sapiens] 0.058

ESTs 0.058 

hypothetical protein 0.058 
130805
134907 
132619 
135115 
100531 
124530 
119960 
132793 
101076 
130655 
134458 
105904 
132878 
121828 
133418 
129317 
130153 
124403 
127683 
1298H 
131770 
117557 
103522 
120029 
102135 
123617 
112136 
133725 
102069 
106555 
123269 
109088 
129399 
129375 
135271 



129364 
123427 
105236 
101012 
134791 
133700 
123887 
129363 
105719 
124226 
117437 

132741 
134437 
107664 
120844 
101574 
131219 
103495 
129607 
106467 
128841 
100515 
119332 
134516 
135012 
103575 
115514 

103996 
110505 
133912 
129581 



U121S4 
D80002 
AA404565 
N35489 

HG1872-HT1907 

N62256 

W87533 

AA478999 

L04270 

N92934 

AA192614 

AA401452 

AA026793 

AA425166 

U76366 

N46244 

D85815 

N31745 

AA668123 

W20070 



Y10514 

W91960 

U15460 

AA609183 

R46100 

V00563 

U09196 

AA455000 

AA491226 

M166837 

AA263028 

W79850 

AA397763 

W90398 

AA477106 

AA598548 

AA219179 

J04444 

L18983 

K01396 

AA621065 

H05704 

AA291644 

H62396 

N27645 

AA394133 

M26041 

AA010594 

AA349417 

M34182 

C00476 

Y09022 

AA404594 

AA450040 

T16358 

HG1723-HT1729 

T54095 

AA171939 

X73608 

Z26256 

AA297739 

AA321355 
H55992 
X62744 
M33600 



Hs.170238 
Hs.178292 
Hs^3447 
Hs.94653 

Hs.102727 

Hs.32699 

Hs.56966 

Hs.1116

Hs.17409 

Hs.83577 

Hs.32060 

Hs.58679 

Hs.98497 

Hs.172727 

Hs.110373

Hs.15114 

Hs.102493 

Hs.134170 

Hs.168625 

Hs.31833 

Hs.44532 

Hs.250640 

Hs.41691 

Hs.181131 

Hs.9739 

Hs.179543 

Hs.82520 

Hs.16725 

Hs.105280 

Hs.72620

Hs.111076

Hs.11081

Hs.97562 

Hs.6147 

Hs.110757

Hs.112471

Hs.19105 

Hs.697 

Hs.89655 

Hs.75621 

Hs.112943

Hs.110746 

Hs.36793 

Hs.190266 



Hs.55898

Hs.198253 

Hs£326 

Hs.96917 

Hs.158029 

Hs.24395 

Hs.153591 

Hs.11607

Hs.154162 

Hs.106443 



Hs*3413 
Hs.93029 

Hs.55609 



Hs.20495 
Hs.77522 
Hs.180255 



sodium channel; voltage-gated; type I; beta polypeptide 0.058

KIAA0180 protein 0.058

ESTs; Moderately similar to kinesin light chain 1 [M.musculus] 0.058

neurochondrin 0.058 

Major Histocompatibility Complex, Dg 0.058 

EST 0.058 

ESTs; Moderately similar to LIV-1 protein [H.sapiens] 0.058

KIAA0906 protein 0.058 

lymphotoxin beta receptor (TNFR superfamily; member 3) 0.058

cysteine-rich protein 1 (intestinal) 0.056 

cysteine and glycine-rich protein 3 (cardiac LIM protein) 0.058

ESTs 0.059

ESTs; Weakly similar to 4F2/CD98 light chain [M.musculus] 0.059

ESTs 0.059

Treacher Collins-Franceschetti syndrome 1 0.059

ESTs 0.059 

ras homolog gene family; member D 0.059 

ESTs 0.059 

ESTs 0.059 

KIAA0979 protein 0.059 

ESTs 0.06 

diubiquitin 0.06 

H.sapiens mRNA for CD152 protein 0.06

sequence-specific single-stranded-DNA-binding protein 0.06 

activating transcription factor 6 0.06 

ESTs 0.06 

ESTs 0.061 

immunoglobulin mu 0.061 

Hu 1.1 kb mRNA upregrtd in retinoic acid treated HL-60 neutrophilic cells 0.061

ESTs 0.061 

ESTs; Weakly similar to dJ963K232 [H.sapiens] 0.061

DKFZP434I114 protein 0.061

malate dehydrogenase 2; NAD (mitochondrial) 0.061 

ESTs; Weakly similar to HPBRII-7 protein [H.sapiens] 0.061

ESTs 0.061
KIAA1075 protein 0.061

DNA segment on chromosome 21 (unique) 2056 expressed sequence 0.061 

ESTs 0.061 

translocase of inner mitochondrial membrane 17 (yeast) homolog 6 0.061

cytochrome c-1 0.062 

protein tyrosine phosphatase; receptor type; N 0.062 

protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 0.062

ESTs 0.062 

H sapiens HCR (a-helix coiled-coil rod homologue) mRNA; complete cds 0.062 

ESTs 0.062 

ESTs 0.062 
yw5e3.s1 Weizmann Olfactory Epithelium H sapiens cDNA clone IMAGE:255676 3' smlr to contains L1.t3 L1 repetitive element; mRNA seq 0.062

ESTs; Highly similar to OASIS protein [M.musculus] 0.062

major histocompatibility complex; class II; DQ alpha 1 0.062

ESTs; Moderately similar to pim-1 protein [H.sapiens] 0.062

ESTs 0.062 

protein kinase; cAMP-dependent; catalytic; gamma 0.062 

small inducible cytokine subfamily B (Cys-X-Cys); member 14 (BRAK) 0.062

Not56 (D. melanogaster)-like protein 0.062 

ESTs 0.062 

ADP-ribosylation factor-like 2 0.062

ESTs 0.062 

Macrophage Scavenger Receptor, Alt. Splice 2 0.062
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.062

ESTs 0.062 

sparc/osteonectin; cwcv and kazal-like domains proteoglycan (testican) 0.063

H.sapiens isoform 1 gene for L-type calcium channel, exon 1 0.063
ESTs; Weakly similar to ISOLEUCYL-TRNA SYNTHETASE; CYTOPLASMIC [H.sapiens] 0.063

EST2393 Bone marrow Homo sapiens cDNA 5' end, mRNA sequence 0.063

DKFZP434F011 protein 0.063

major histocompatibility complex; class II; DM alpha 0.063

major histocompatibility complex; class II; DR beta 1 0.063
130139


R3&280 


Hs.150922


106817 


AA397825 


Hs.5307 


134658 


AA410617 


Hs.178009

1 U» I * www 


100306 


D50495 


Hs.80598 


100277 


D42053 


Hs.75890 


IOO 1 ID 


D61259 


Hs.6529 


134909 


AA521488 


Hs.90998 


130319 


X74794 


Hs.154443 


132057 


AA102469


Hs.173484

I IO. 1 f U7V1 


1 UOOOH 


AA070473 




1007R3 
l COl oo 


F10815 


Hs.12373


IOO 1 \C 


T67464 


Hs.94617 




AA436856 


Hs.98910


IOOw£ 


AA457129 


Hs.6455 


1 IO&IO 


T5B6/T7 








Hs.17719
rid. 
He 947TA9
He 7R7R1
i<:oyyi


D0 1 CCO 

no loot 


He A91 


100.CQO 

luyooo 


CAOO00 

rvcocc 


He Ofiioj; 
nsxo 1 oo 


1100X1 

1 lsfc41 


1 nooy 


He 901 1fl9 


looooy 


MM lOOOa/ 


He 9RR441 


112926 


T10316 


Hs.4302


120495 


AA256073 


Hs.190626 


louyol 


A A07Q/10 


He 0111R 

nsx 1 040 


1^9982 


M87789 


He 1/rt 

ns.i4U 


133832 


UAOOOT 

H033o7 


Up 0^1 one 
nSji41oU0 


110697 


n93721 


nS.tU/yo 


121183 


AMuOloo 


He 077/W 


130953 


U12707 


U« 01C7 


lUcZld 


Ut4 IOO 


na.f oidu 


114181 


Z39079 


Hs.8021 


116581 


D51287 


Hs.82148 


132498 


T87708 


Hs.50098


103788 


AA096014 


Hs.9527 


102459 


U48936 




100373 


D79999 


Hs.77225 


132717 


AA203321 


Hs.151696 


128863 


D87462 


Hs.106674 


115193 


AA262Q29 


Hs.88218 


124558 


N66046


Hs.141605 


117225 


N20392 


Hs.42846 


110665 


H83380 


Hs.32757 



BCS1 (yeast homolog)-like 0.064

synaptopodin 0.064

ESTs 0.064

transcription elongation factor A (SII); 2 0.064
site-1 protease (subtilisin-like; sterol-regulated; cleaves sterol regulatory element binding proteins) 0.064

ESTs 0.064 

KIAA0128 protein 0.064 

minichromosome maintenance deficient (S. cerevisiae) 4 0.064

ESTs 0.064 
zm7c8.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone IMAGE-5399 3', mRNA sequence 0.064

KIAA0422 protein 0.064

ESTs; Weakly similar to predicted using Genefinder [C.elegans] 0.064

ESTs 0.064 

RuvB (E. coli homolog)-like 2 0.064
ya94a02.s1 Stratagene placenta (#937225) Homo sapiens cDNA clone IMAGED 9290 3', mRNA sequence. 0.065

ESTs 0.065 

VGF nerve growth factor inducible 0.065 

phospholipase A2; group IVC (cytosolic; calcium-independent) 0.065

H.sapiens DAT1 gene, partial, VNTR 0.065

ligase III; DNA; ATP-dependent 0.065

ESTs; Highly similar to CGI-69 protein [H.sapiens] 0.065

IMP (inosine monophosphate) dehydrogenase 1 0.065 

ESTs 0.065 

ESTs 0.065 

steroidogenic acute regulatory protein related 0.065 

ESTs 0.065 

ESTs; Highly similar to CGI-38 protein [H.sapiens] 0.065

ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens] 0.065

ESTs; Weakly similar to T20B12.3 [C.elegans] 0.065

ESTs 0.065 

ESTs 0.065 

ESTs 0.065 

ESTs 0.066 

Homo sapiens unknown protein mRNA, partial cds 0.066 

prepronociceptin 0.066

TYRO protein tyrosine kinase binding protein 0.066 

EST 0.066 

dimethylarginine dimethylaminohydrolase 2 0.066

serine dehydratase 0.066

Human gamma-aminobutyric acid transaminase mRNA, partial cds 0.067

biglycan 0.067

ESTs 0.067 

ESTs 0.067 

EST; Moderately similar to CGM36 protein [H.sapiens] 0.067

ESTs 0.067 

ESTs 0.067 

ESTs; Weakly similar to F42C5.7 gene product [C.elegans] 0.067

immunoglobulin gamma 3 (Gm marker) 0.067 

estrogen-responsive B box protein 0.067 

ESTs 0.067

ESTs 0.067 

Wiskott-Aldrich syndrome (eczema-thrombocytopenia) 0.067

phosphofructokinase; muscle 0.067

KIAA1058 protein 0.067

ribosomal protein S12 0.067

ESTs 0.068 

ESTs; Highly similar to HSPC013 [H.sapiens] 0.068
Human amiloride-sensitive epithelial sodium channel gamma subunit mRNA, 5' end, partial cds 0.068

ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 1 0.068

DKFZP727G051 protein 0.068

BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase) 0.068

ESTs 0.068 

ESTs 0.069 

ESTs 0.069 

ESTs 0.069 
1170663


HsJB2965 

i to. iulwvm 


Kruppel-like factor 4 (gut)


105778 


AA348910 


Hs.153299


DOM-3 (C. elegans) homolog Z


l\r*l i\J 


R72079 




CD79B antigen (immunoglobulin-associated beta)


123097 


AA485B69 


Hs.105671 


ESTs 




HG3523-HT4899 




Proto-Oncogene C-Myc, Alt. Splice 3, Orf 114


125091 


T91518 




ye20f05.s1 Stratagene lung (#937210) H sapiens cDNA clone IMAGE: 3' similar to contains Alu repetitive element; contains MER12 repetitive element; mRNA sequence.


100756 


HG3565-HT3768 




Zinc Finger Protein (Gb:M88357) 


113483 


187768 


Hs.16439 


ESTs 


mi no 

lUl 1 \o 






mmntarrv>nl nrimtwiRnt 9 


102286 


U31628 


Hs.12503 


interleukin 15 receptor; alpha


1 S>OHJ 


nwi7d 

UOwl tH 


He 0930 

na.9 SOU 


mlbnfin-hmdino nrntRin 2 fcoflinsn 2) 


100991 


J03764 


Hs.82085 


plasminogen activator inhibitor; type 1


133675 


AA443720 


Hs.7551 


ESTs; Weakly similar to T25G3.1 [C.elegans]


105422 


AA251014 

rVTtsJ 'wit 


Hs.12210 


ESTs 


102932 


X13334 


Hs.75627 


CD14 antigen 


1 1914/ 


nooo/o 


He R5739 








H« 1R0481 


ESTs; Weakly similar to ACROSIN PRECURSOR [H.sapiens]


1 JO 1 OD 


A AAA 1404 


Hs.6686 


ESTs 


1 l3*iaO 


AA290674 


Hs.71819 


eukaryotic translation initiation factor 4E binding protein 1 


101IYIG 
19AARQ 


AA3QR339 

RfiQflflft 
nouuoo 


HeQ7fi13 


co I a 

ESTs; Weakly similar to F55A12.9 [C.elegans]




rotoo/o 


He 108969 


mannosidase; alpha; class 2B; member 1


1191M 
lie ID 1 


n*tO£93 




ESTs; Wkly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]


120201 


W0/4O0 


He 1A1A&A 
no. i*» 


ESTs 


1 04&30 


myyiir 

OUl/l ID 


He 81343 


collagen; type II; alpha 1 (primary osteoarthritis; spondyloepiphyseal 


119/40 


W/V404 


Me KR0Q3 


ESTs 


131306 


AA2o20oO 


n 5.20409 


CO IS 


iVf ff\> 


A AMRR90 
MnulOO£U 


He 991147 
nox£ i i hi 


ESTs 


1042/1 


MM 199DOU 


He IftAARR 
na. IO*f*00 


ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens]








Accession not listed in Genbank


10O4U2 


Q7RQ49 
O/094C 


Ho 09999 

na.999££ 


dnnaminfl rnnpntnr Dd 


I lb/42 




He £0494 


per 

CO 1 


131RR7 


NAAfiAfi 
iMOtooo 


Hs.3353 


Homo sapiens clone 24940 mRNA sequence


lU2y2d 


Y19C17 


He lORQ 


small nuclear ribonucleoprotein polypeptide C


1UU//0 


UrVV71-UT9fc3AA 

nuo/ 1 n I cuooo 




Mucin 1, Epithelial, Alt. Splice 9


111020 


N54361 


Ue 1RC79R 


CO IS 


4*2/1314 

104224 


AOUOZc 


He 1R35.Q3 

no. I00090 


rihrtenmal nmtprn 1 1Ra 


124U0a 


F13R73 


He QQ769 


ESTs 


loda/2 


AA1KA7/13 


HcTflfllQ 
fl».r0wl9 


Homo sapiens clone 24432 mRNA sequence


■100C01 

iZyooi 


A A/CJCfYYJ 
AA4ooUUy 


He 17R1A6 

ns. wo too 


ESTs; Weakly similar to WASP-family protein [H.sapiens]


lUOvOO 


YCQQQQ 
A0OO99 


Hs.81221 


Human L2-9 transcript of unrearranged immunoglobulin V(H)5 pseudogene


124900 


T 10971 
I Ik/ 1 


He I'v'^n 


VOHIOJUf 1 


1122/U 


DC5091 
nOoUZl 


Hc9A33Wl 
ru>xvoooo 


ESTs 


110/04 


C1MR3 


He fifi140 
no.Dv ih\j 


EST 


JZyoinJ 


m 10099 


He 111481 


Mnitnnlaemln ffnrroxiria 4 ^)) 

vOI UlUUlaOIIIU 1 IIDI lUAKJQuu^ 


12/O40 




He 1fiR9*v3 


ESTs; Highly similar to KIAA0476 protein [H.sapiens]


112430 




rtS^009 1 


FSTe 
co la 


11/ICQ1 

114031 


AAUOOUOO 


He 9113330 


ESTs 


IOC I'M 

100122 




He QA314 
nS.9*Kl IH 


ESTs 


103934 


A A 9*31990 

AAZolooo 


Ue 1Q49fVl 


Homo sapiens mRNA; cDNA DKFZp564C186 (from clone DKFZp564C186)


109363 


A AUCJCO 
AA210009 


Ue IfiCTftA 
nS.loO'04 


ESTs; Weakly similar to hypothetical protein [H.sapiens]


112647 


DQ9990 


nS.o04UJ 


POT*. 

coi a 


1970A3 


244079 


Hs 91608 


otofsriin 


133027 


AA402624 


Hs.63236 


synuclein; gamma (breast cancer-specific protein 1 ) 


122086 


AA432121 


Hs^50986 


EST 


110405 


H47542 


Hs.33962 


ESTs 


128697 


AB002344 


Hs.103915 


KIAA0346 protein 


112221 


R50380 


Hs.25670 


ESTs 


100478 


HG1067-HT1067 




Mucin (Gb:M22406) 


115598 


AA400129 


Hs.65735 


ESTs 


132491 


AA227137 


Hs.4984 


KIAA0328 protein


101655 


M60299 




Human alpha-1 collagen type II gene, exons 1, 2 and 3


106018 


AA411887 


Hs.34737 


ESTs 


129683 


W05348 


Hs.158196 


DKFZP434B103 protein


134137 


F10045 


Hs.79347 


KIAA0211 gene product 


114008 


W89128 


Hs.19872 


ESTs 



0.069 
0.069 
0.069 
0.069 



0.069 

0.069 

0.069 

0.069 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07

0.07 

0.07 

0.07 

0.07 

0.071 

0.071 

0.071 

0.071 

0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.071 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.072 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
0.073 
107653 AA010210 Hs.47041

104708 AA029462 Hs.17235

134082 L16991 Hs.79006 

119180 R80413 Hs.92520 

107741 AA016982 Hs.64341

133683 AA335223 Hs.75558 

111694 R22035 Hs.23331 

120764 AA338729 Hs.133096 

119389 T88826 Hs.90973 

100929 HG688-HT688

119388 T88798 

133019 AF009674 Hs.184434 

105185 AA191495 Hs.189937 

133413 S72043 Hs.73133 

101017 J04599 Hs.821

132865 K02765 Hs.251972

110882 N36001 Hs.17348 

129197 T90303 Hs.109308

101184 L19871 Hs.460 

134910 AA431320 Hs.9100

119411 T96621 Hs^03656 

102000 U01824 Hs.380 

114691 AA121893 Hs.103779 

134179 U53204 Hs.79706 

134503 U34880 Hs.84183

129719 N66396 Hs.167766 

113916 W80464 Hs.31928 

113897 W73926 Hs.4947 

129697 R00841 Hs.172069

112078 FU4155 Hs.112218

121980 AA429886 Hs.110407 

100898 HG4638-HT5050 

121626 AA416974 Hs.98174 

133670 AA243416 Hs.75470

131879 AA017161 Hs.33792 

100254 D38037 Hs.77643 

133194 AA291726 Hs.67201 

106081 AA418394 Hs.25354 

115544 AA351433 Hs.66187

119955 W87460 Hs.58989 

104407 H61361 Hs.102171 

135019 X58431 Hs.98428 

114815 AA161488 Hs.103931 

119471 W31352 Hs.55445

117788 N48292 Hs.46849 

119406 T95064 Hs.193771 

130777 R61742 Hs.256554 

130494 L13197 Hs.75874 

104107 AA424111 Hs.12598

121483 AA411981 Hs^5274 

104451 M13299 Hs.102119 

118027 N52770 Hs.75968 

109419 AA227560 Hs.86987 

115783 AA424487 Hs.72289

110585 H62223 Hs.133526

123165 AA488863 Hs.105216 

103966 AA303166 Hs.127270 

109549 F01528 Hs^1192 

106730 AA465520 Hs£2313

120310 AA193676 Hs.118926

104078 AA402801 Hs.222010 

117624 N35978 Hs.82364 

112421 R62441 Hs.23127 

106958 AA497026 Hs.22059

129984 W92811 Hs.183927 

122044 AA431456 Hs.98736 

123280 AA491285 Hs.175144 

115710 AA412535 Hs.55235 



ESTs 0.073 

ESTs 0.073

deoxythymidylate kinase 0.073

ESTs 0.073 

ESTs 0.073 

pepsinogen 5; group I (pepsinogen A) 0.073 

ESTs 0.073 

ESTs 0.073 

ESTs 0.074 

Major Histocompatibility Complex, Class II, Dr Beta 2 (Gb:X65561) 0.074

plasminogen activator inhibitor, type I 0.074 

axin 0.074 

ESTs 0.074 

metallothionein 3 (growth inhibitory factor (neurotrophic)) 0.074

biglycan 0.074 

complement component 3 0.074 

ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 0.074

ESTs; Wkly smlr to leucine-rich glioma-inactivated prot precursor [H.sapiens] 0.074

activating transcription factor 3 0.075 

ESTs 0.075 

EST 0.075 

solute carrier family 1 (glial high affinity glutamate transporter); member 2 0.075

ESTs; Weakly similar to envelope protein [H.sapiens] 0.075

plectin 1; intermediate filament binding protein; 500kD 0.075
diptheria toxin resistance protein required for diphthamide biosynthesis (Saccharomyces)-like 1 0.075

ESTs; Moderately similar to Pro-a2(XI) [H.sapiens] 0.075

ESTs; Wkly smlr to alternatively spliced product using exon 13A [H.sapiens] 0.075

ESTs 0.075 

DKFZP434C212 protein 0.075 

ESTs 0.075 
ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans] 0.075

Spliceosomal Protein Sap 49 0.075 

ESTs 0.075 

hypothetical protein; expressed in osteoblast 0.075 

ESTs 0.075 

FK506-binding protein 1B (12.6 kD) 0.075 

ESTs 0.075 

ESTs 0.075 

Homo sapiens clone 23700 mRNA sequence 0.076

ESTs 0.076 

immunoglobulin superfamily containing leucine-rich repeat 0.076 

Human Hox2£ gene for a homeobox protein 0.076 

DKFZP434B0335 protein 0.076 

ESTs 0.076 

ESTs 0.076 

EST 0.076 

ESTs 0.076 

pregnancy-associated plasma protein A 0.076 

T-cell lymphoma invasion and metastasis 2 0.076

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.076

blue cone pigment 0.076 

thymosin; beta 4; X chromosome 0.076 

receptor-interacting serine-threonine kinase 3 0.076 

ESTs; Weakly similar to LIV-1 protein [H.sapiens] 0.076

ESTs; Wkly smlr to !!! ALU SUBFAMILY SB1 WARNING ENTRY !!! [H.sapiens] 0.076

ESTs; Weakly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.077

ESTs 0.077 

Homo sapiens clone 25155 mRNA sequence 0.077

ESTs 0.077 

DKFZP586K0919 protein 0.077

ESTs 0.077 

ESTs 0.077 

ESTs 0.077 

ESTs 0.077 
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.077

EST 0.077 

ESTs 0.077 
sphingomyelin phosphodiesterase 2; neutra 
134129 D87444 Hs.79305

129321 AA224502 Hs.206501 

130513 AA460257 HS.158B6 

100996 J03909 Hs.14623

128358 AI095718 Hs.135015

128544 R59352 Hs.119273

106040 AA412681 Hs.125139 

106495 AA452113 Hs.32454 

131833 R40899 Hs.32973

119219 R97176 Hs.110783

135415 X60655 Hs.99967 

109457 AA232646 Hs.68061 

117137 H96670 Hs.42221 

107094 AA609614 Hs£241

130165 T90529 Hs.251613

124072 H05252 Hs.101637 

126151 AA324743 Hs.40808 

119035 R01779 Hs.7740 

110157 H18987 Hs.169731

128515 AA149044 Hs.10086 

133069 U94836 Hs.6430 

112209 R49644 Hs.24865 

133361 R28279 Hs.71848 

134714 U89922 Hs.890

129905 T86796 Hs.132875 

120421 AA236166 Hs.132957 



100885 HG4490-HT4676 



102789 U86759 Hs.158336 

120139 Z39273 Hs.77876

135238 U76343 Hs.96970 

129618 N54845 Hs.173030 

132960 AA609742 Hs.6150 

108751 AA127063 Hs.203717 

134060 D42039 Hs.78871

111338 N79778 Hs.35094 

112345 R56880 Hs.26563 

126456 W00881 

128937 Z39939 Hs.10726

103485 Y08409 Hs.248415 

111202 N6B280 Hs.107922 

132625 AA429890 Hs.166066 

103434 X98085 Hs.54433 

102616 U655B1 Hs.159191

102667 U70867 Hs.83974

111422 R01127 Hs.19104

101411 M16938 Hs.820 

113267 T65058 Hs.12725 

103559 Z19585 Hs.75774

131588 AA258613 Hs.29189 

107821 AA020991 Hs.172856 

134278 H82839 Hs.81001 

120893 AA369800 Hs.97058 

108786 AA128999

106890 AA489245 Hs.88500 

119760 W72267 Hs.58219 

132999 Y00787 Hs.624 

129156 AA028195 Hs.108973

121171 AA400008 Hs.161814 

103864 AA207264 Hs.181077 

128591 AA255537 Hs.102057 

122172 AA435753 Hs.161854 

112802 R97647 Hs.174855

107723 AA015967 Hs.60680 

113011 T23737 Hs.1600 

131279 AA089853 Hs.25197 

103190 X70083 Hs.58414 



I membrane (neutral sphingomyelinase) 0.077 

KIAA0255 gene product 0.077 

Homo sapiens clone 643 unknown mRNA; complete sequence 0.078

ESTs 0.078

interferon; gamma-inducible protein 30 0.078

ESTs 0.078 

KIAA0296 gene product 0.078 

ESTs 0.078 

ESTs; Moderately similar to KIAA0544 protein [H.sapiens] 0.078

glycine receptor; beta 0.078 

ESTs 0.078 

even-skipped homeo box 1 (homolog of Drosophila) 0.078 

ESTs; Weakly similar to sphingosine kinase [M.musculus] 0.078

ESTs 0.078 

ESTs 0.078 

EST 0.078 

EST; Weakly similar to hypothetical protein [H.sapiens] 0.078

ESTs 0.078

ESTs 0.078 

ESTs 0.078 

ESTs; Highly similar to HYPOTHETICAL PROTEIN KIAA0195 [H.sapiens] 0.078

protein with polyglutamine repeat 0.078 

ESTs 0.078

Human clone 23548 mRNA sequence 0.078

lymphotoxin beta (TNF superfamily; member 3) 0.078 

ESTs; Weakly similar to predicted using Genefinder [C.elegans] 0.079

ESTs; Weakly similar to chondromodulin-I precursor [H.sapiens] 0.079

Proline-Rich Protein Prb4, Allele 0.079

netrin2(chicken)-like 4 0.079
Human DNA from chromosome 19-specific cosmid R30923; genomic sequence 0.079

Human liver GABA transport protein mRNA; 3' end 0.079

ESTs 0.079 

KIAA0521 protein 0.079 

ESTs 0.079 

KIAA0081 protein 0.079 

extracellular matrix protein 2; female organ and adipocyte specific 0.079 

ESTs 0.079 
za56d02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:296547 5', mRNA sequence. 0.079

ESTs 0.079 

thyroid hormone responsive SPOT14 (rat) homolog 0.079

ESTs 0.079

cisplatin resistance associated 0.079

tenascin R (restrictin; janusin) 0.079

ribosomal protein L3-like 0.079

solute carrier family 21 (prostaglandin transporter); member 2 0.079 

ESTs 0.079 

homeo box C6 0.08 
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.08

thrombospondin 4 0.08 

KIAA1021 protein 0.08 

ESTs 0.08 

ESTs; Weakly similar to DY3.6 [C.elegans] 0.08

EST; Highly similar to CMP-N-acetylneuraminic acid hydroxylase [H.sapiens] 0.08
zo8f12.s1 Stratagene neuroepithelium NT2RAM1 937234 Homo sapiens cDNA clone IMAGE:567119 3', mRNA sequence 0.08

KIAA1066 protein; JSAP1 homolog (mouse); JIP3 homolog (mouse) 0.08

ESTs 0.08 

interleukin 8 0.08

dolichyl-phosphate mannosyltransferase polypeptide 2; regulatory subunit 0.08

ESTs 0.08 

ESTs; Weakly similar to Miller-Dieker lissencephaly gene [H.sapiens] 0.08

ESTs; Weakly similar to O-linked GlcNAc transferase [H.sapiens] 0.08

EST 0.08 

EST 0.08 

EST 0.08 

chaperonin containing TCP1; subunit 5 (epsilon) 0.081

STIP1 homology and U-Box containing protein 1 0.081

filamin C; gamma (actin-binding protein-280) 0.081
AA9Q3A11


He 9333Z8 


ESTs 


0.081 






He 13fldQ3 


F^Te 

(.Old 


0.081 


126126 


M85370 




EST01R84 Fetal brain. Strataaana fcat#938206\ Homo sanlens cONA 

rlnno HPRflHID mRNA epnitonrfl 
umio nroun iv, iiituim ocv^uoiivaI. 


0.081 


130094 


H43286 


Hs.167017


gamma-aminobutyric acid (GABA) B receptor; 1


0.081


100800 


HG3Q4S-HT4215 




Phryenhftlinid Transfar PmfAtn 


0.081


lUOD/O 


A A1 15940 


He 81818 


ESTs 


0.081 




AA93A95Q 


He QQftlfi 


CO IS 


0.081 


129666 


M77349 


Hs.118787


transforming growth factor; beta-induced; 68kD


0.081 


IUID40 


M8Q807 


Hs.943 


natural killer cell transcript 4


0.081 


130536 


T17045 


Hs.159492


spastic ataxia of Charlevoix-Saguenay (sacsin)


0.081 


107732 


AA01G181 

f\nxi ID lO I 


Hs.59752 


ESTs 


0.081 


i 93071 




Hs.104285


ESTs 


0.081 


113537 


T90457 


Hs.191293


ESTs 


0.081 


101250 


L34060 


Hs.79133


cadherin 8


0.081 


122521 


AA44Q433 


Hs.149227


ESTs; Weakly similar to PROLINE-RICH PROTEIN MP-3 [M.musculus]


0.081 


133314 

IMS ■*» 


N32811 


Hs.77542


ESTs 


0.081 


102038 


U05659 


Hs.477 


hydroxysteroid (17-beta) dehydrogenase 3


0.081 


110338 


n*H/ooo 


He 174DQ4 


ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]


0.081 


110RT7 
1 lODO/ 


N7097A. 


He 4 Qfl 00 


CO IS 


0.081 


1 1 / ODD 


loo? 


He 04019 


Co 1 b 


0.082 




U07A71 

ntJ/o/ 1 


He IflO^n 

ns. locOtu 


PQTc- Waaitlv efmflar In Mnticn 10 ^ mRNA' mrrmfafa rrfc miienifncl 
CO I 5, VVcalUy ollIliKlf IU IVIUUba laJD llinixM, wIHJJIblW iaJo (fVi.iliUbvUlUaj 


0.082 


1 Wool 


U/OOO L 


He 19*1078 


Hitman mRMi fnr nrnifhtno Horar^nwlneo ftntiTomo* DRF 1 anH ORE 0 


0.082 




T179Q1 


He 101174 

Mb. IUI If *t 


mlrfntuhirtO'aee/viflfoH nmiatn tan 


0.082 


IO£00£ 




He 


WIAAfllitR nana nrnrliirt 


0.082 




73048Q 


He 37R1 
no.o i o i 


Hrwnn conipne RAH rinna Rf511ftn07 from 7n^1 


0.082 




ICOOOO 


Lf e 1 0ft 07 


CO IS 


nnoo 

V.MOC 




MAUOOOO*: 


He 90774 


CO IS t 


0.082 


1 1 lUol 




He QRfl£Q 


CO IS 


v.\joc 




n«K)3f / 


He 1R35Q3 


nKrionm^l nmfoin 1 IRQ 
nUOSvindl JJlULclf 1 L IOa 


0.082 


1u24o3 


U4o4o7 


nS./40DO 


amyloid beta (A4) precursor-like protein 1




126204 


AI080388 


Hs.134296 


ESTs 


0.082 


116615 


D80666 


Hs.45203 


ESTs 


0.082 


19QOCC 


A 

AMcIsOOc 


He OAA1/M 
nSiW 1*W 


PQTc* M/vtkf emir in himnr no/S'/tcte fai-tnr.alnha-fnrfnroH nrnl R19 IH eaniflncl 

to i s, iViouiy Qinii iu luinor itoauois louiui *aipna^inuuu;u piui ou [n^xjjjraj ioj 


0.082 


i 40T7C 




He 1AAQA 


CQTc 
CO IS 


0.082 


105494 


A AOCCO70 


He 90900 


nomo sapiens rnnixM, clwa ui\rtp»K>»ri ^uom aone ur\rtpw**r i/h^ 


\J.vOC 


117000 


H84718 


Hs.112236


ESTs; Weakly similar to repressor protein [H.sapiens]


0.082 


t l£D3D 


Baffin 


He 

no. IOO lu I 


trnnciont rorontnr nntpntinl rhannpl 7 
uanoiwni lovcjjiui (juiciiuai uiaiiiici » 


0.082 




Juocyy 


He 1H7it 
MS.IU/H 


SUnaCwni, pUlnKJ}ldiy*dS5UClalcU ptUiaUl V> 


0.083


1 1 6957 


n/9^9^ 


ns.oyyou 


CCTe 
COlS 


noon 


101057 


K03430 




Human complement C1q B-chain gene, exon A+1


0.083 


101CMQ 


AAA9QAC9 


He QfmftO 


CCTe 
CO IS 


0.083 


louoZZ 


MoUo4/ 


He 9IY11 


uiromooxane m synuiose i ^piaisiei, cyiocnioiTie rnw, suDidmuy 


0.083


lZZ74o 


A A/4CQC7/1 

AA4000/4 


He 00470 


col 


0.083 


114009 


A ACVWIIC 
AAUCKXJID 




IMfHUlC^i4C94/ O 51(1 Midi VO In.ClaOcO 1 ulwcOl Inlv/nCUVAin 

REDUCTASE; contains Alu repetitive element;, mRNA sequence


0.083 


132270 


1 r7AC71 


He >IOROQ 


ataxin 2 related protein 


0.083


lUol2o 


AAU049&1 


He A7A 1Q 


CCTe 
COlS 


0.083


lU2ooU 


AU4o25 


He 9A70 


nan iirnr^ii^n nrntain* hata 1* (mnnovin • fHi a r^A t JUI a ftO~TfV^ttl 

gap juncuon proiein, ueia i , odhu ^ujiinuxfi i oc t \j\ iai wi-mdiRr i ouui 
neuropduiy, A-tinKeo/ 


0.083


115365 


AA282089 


nS.oo599 


CCTf 

colS 


0.083 


11*K)£9 


AAOCOQQH 

AAUo^y&u 


He 9flA7ft4 


CCTe 
COlS 


0.083


135017 


AA249580 


Hs.9315 


CCTr> MfAttrlw rlmiUr MCI IDHMAI Ol CAOTHMCniM QCI ATCH 

co 1 s, woawy similar to NtunuiM/y. utr i uMtLni>i-ncLM i cu 

PR 1 HPAI !7Pr» PRfYTPIM IW eanSonel 

Ln LUvAUttu rnu I ciN ^n.sapiensj 


0.083


\cot to 


AAo IUU/ 1 


He 1 1901^ 

ns. i itoio 


FCTe 
CO J o 


0.083 


114454 


AA021091 


Hs.226208


ESTs 


0.083 


101246 


L33799 


Hs.202097 


procollagen C-endopeptidase enhancer


0.083 


107366 


U78310 


Hs.13501 


pescadillo (zebrafish) homolog 1; containing BRCT domain


0.083 


132779 


T89601 


Hs.95497 


ESTs; Weakly similar to GLUCOSE TRANSPORTER TYPE 5;
SMALL INTESTINE [H.sapiens]


0.083 


129709 


AA112209


Hs.1209


acyl-Coenzyme A dehydrogenase; long chain


0.083 


115244 


AA278767 


Hs.914 


Human mRNA for SB class II histocompatibility antigen alpha-chain


0.083 


123253 


AA490878 


Hs.111334 


ferritin; light polypeptide 


0.083 


128469 


T23724 


Hs^58677 


EST 


0.083 


132220 


AA431847 


Hs.42409 


ESTs; Highly similar to CGM46 protein [H.sapiens]


0.083 


111664 


R17939 


Hs.22344


ESTs 


0.083 


102354 


U38268 




Human cytochrome b pseudogene, partial cds 


0.084 


112828 


R98774 


Hs.194338 


ESTs 


0.084 
1 lv*» Iv


H47868 


Hs.34024 


ESTs 


102620 


U66052 




Human clone W2-6 mRNA from chromosome X


iaocca 

lUZOoU 


i rcp.nn.7 


He 14*^1 
no. iHo** i 


cullin 1




rtAU/O/ ID 




7mflPpLl el ^tratanonfi ovarian cancer f#937219) H saotens cONA Clone 








1MAGF S4512 3* similar to ob^(14723 CLUSTERIN PRECURSOR 








(HUMAN);, mRNA sequence 




TR79fl5 


He ITflRQ 
no. louoy 


ESTs 


A 170CA 

117869 


Mj*Q047 
N4a94r 




CO 1 o 


1 IJ/OH 




no. 100/ / 


to 1 


1 OOOtO 


C00424 


Hs.7101 


periodontal ligament fibroblast protein 






He 194fttft 
no. ic*400o 


ESTs 


1U10I3 


MOO iOO 


fib. Oil OO 


tranenhitominaep ? /fl noK/npniidfi* nrotaln^tutaminfi 








-/ia mm a -nit itamvtira nefora ep\ 


iiyocic 




no. isoooo 


F*iTe' MAHarntoh/ eimilar to attsmativatv soiiced Droduct 








iieinn ovnn 11A fH eanipnel 
liOiiiu wauji ion if l.oautciioj 


IcOOcO 


AAfiPAfiRfi 


Hs.112884


EST 


lUoOl 1 




He 113Q73 
no. i looro 


myosin; heavy polypeptide 8; skeletal muscle; perinatal




/VVWOvoV 


no-t qooh 


ESTs; Weakly similar to ION CHANNEL HOMOLOG RIC








PRECURSOR [M.musculus]


128670 


T1CQQC 

I looyo 




to is 


1 30814 




He 1Qft.11 

ns. i ao lo 


FQTe 

CO 1 O 


loiwyi 




He 797 


inhfhin- hpta A /nrfivin A* activin A8 atoha Delvneotidel 

II U null l, UcLd r\ ^abUVlll t\ avuvui txo aipna yuiyycyvwj 




fvYhjf IDO 


He 1 1<W17 


FSTc* Woaktv similar to ceded for bv C eleaansrcDNA vk173c12^ IC eleaansl 


109284 


AA1QAGQR 

AAiyoyyo 


He ftfifiQO 


CO la 


i 1CCQO 




He fififtQQ 

ns.oouyy 


CO 1 0 


100545 


lir^rti *7 11X004 "7 

HG2147-HT2217 




Mitrtn *4 Intactinal /r2K-Mi;(^n^\ 

Mucin o, iniesunai \ud.wdo«*uo/ 


IUZ004 


UDDr 1 1 


He 77RR7 


lyiiipiiuuyio QJitiytJii o ujnifJtcA, iuuuo u 




DOCOQQ 
KcOooy 


He ooocc 
no.coooo 


F^Te* Wpaklv eimilar tn FAST kina^ IU saoiensl 
uOl o T t wedwy oillllloi vJ rrto i wiiaov \* i-oopioiioj 


1 AC "101 


AA10fift7ft 
AAiyUD/O 


He 1H07d 
no. 1 Ira / *♦ 


CO 1 Of IVlUUUiaTOiy ollMUol IU UUIUIUWM [ri.iiui»oyiLrUOj 


122681 


AA4O0O3U 


ns.yy4ui 


FQT 
CO 1 


114543 


* A AC£H H 

AA056121 


nS.ioo4iy 


CQTe 
CO IS 


133597 




He 711 

ns./oioy 


nartnor Af RAH1 /arfanfin 9\ * 


l£luo4 




He 07/tnR 


CO 1 b 




A A4OCOR0 


He 10770ft 
no. 19/ tcO 


ESTs; Weakly similar to ZINC FINGER PROTEIN 135 [H.sapiens]


100309 


ncAccn 
UoOtwU 


He QCRCQ 

rvs.yoooy 


totfial Aianl tatvao /nrAeAnhi(a\ hnmntrm 1 
loUlaiyiaiU laivda iviuoupiiiiaj nujiiuiuy i 


101/27 


M/o4ol 


He 71ftfl1 

no./oooo 




131226 


Ml D3400 


nS.£44ro 


CO IS 


133580 


A AflOCAjd 

AA09oU4i 


He imn7^ 
ns. loiu/o 


CO IS 




Uo/yo4 


He 9i>7R7fi 


RTP hinrlinn nmtmn 1 


104S7O 


A AAOC/fQA 


He 1A0RRQ 


ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]


120865 


AA350631 


nS.ybbbo 


COl 


i AC AO A 

106080 


A A>l tOflAC 

AA4IOU40 


He 011 OA 


FSTe 
CO lo 


128571 


M416619 


HS.IOIool 


co IS 


101838 


M92934 


Hs.75511 


connective tissue growth factor 


128514 


H84261 


LJ_ 4ftft(WO 

Hs. 100843 


co 1 s, weawy similar 10 similar 10 u i r-oinouig proiBin iis.eiegansj 


123039 


AA485931 


MS.79 


Bmincecyiaso i 


134067 


Y08200 


m. "7CKVOA 

HS.78y20 


nan yeranyigQranyiudnsTerase, aipnu omjuim 


116967 


H80336 


u» Am OA 
nS,4U1^4 


cot 


110053 


H1258D 


He fiQIRO. 

n 5.030 do 




114395 


AA007313 


MS.1 IUIdo 


CO IS 


107465 


W44681 


HSxblooo 


iTHJiuiy luiruvuuo uiieyidUUii otw i Nuuiuiuy 


101983 


S85655 


nS./oo^O 


prohibitin 


112544 


R70948 


11 _ OOI CO 

Hs^yloo 


CQTe 
CO IS 


111423 


R01165 


U« -tOQ CAT 


CO IS 


127918 


AA806043 


MS.1 lDoyo 


Hitman normftno InH rhain nono' P-rpnifin" fV/tAfta-l riAmain 


m7orv\ 


TA7WAO 


HeOfUfifl 
no.owwo 


ESTs 


134947 


R51194 




yj71a08.r1 Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE:154166








5' similar to gb:L11284 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN








KINASE KINASE 1 (HUMAN);, mRNA sequence. 


124579 


N68345 


Hs.127179 


ESTs; Weakly similar to TERATOCARCINOMA-DERIVED GROWTH








FACTOR 1 [H.sapiens]


130471 


Z68280 


Hs.183706 


adducin 1 (alpha)


116596 


D60755


Hs.92955 


ESTs 


105069 


AA136345 


Hs.23617


ESTs; Weakly similar to ZF0C1 gene product [H.sapiens] 


102491 


U51010 




Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region


130069 


AA055896 


Hs.146428 


collagen; type V; alpha 1 


130234 


AA280413 


Hs.157441 


spleen focus forming virus (SFFV) proviral integration oncogene spi1


120540 


AA262992 


Hs.96417 


ESTs 


122508 


AA449221 


Hs.20432


ESTs 



0.084 
0.084 
0.084 



0.084 
0.084 
0.084 
0.084 
0.084 
0.084 

0.084 

0.084 
0.084 
0.084 

0.084 
0.084 
0.084 
0.084 
0.084 
0.084 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 



0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.086 
0.086 
0.086 
0.086 



0.086 

0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
128054


AI205718 


Hs.125416 


ESTs 


0.086 


133020 


AA053248 


Hs.185182 


ESTs; Highly similar to 40S RIBOSOMAL PROTEIN S10 [H.sapiens]


0.086


130056 


AA017356 


Hs.171900 


armadillo repeat gene deleted in velocardiofacial syndrome


0.086 


130504 


U46865 


Hs.158323 


CCAAT/enhancer binding protein (C/EBP); epsilon


0.086 


133978 


W73859 


Hs.78061 


transcription factor 21 


0.086 


105265 


AA227941 


Hs.56088


ESTs 


0.086 


133035 


T15965 


Hs.6333 


ESTs 


0.086 


100768 


HG3636-HT3846 




Myosin, Heavy Polypeptide 9, Non-Muscle 


0.086 


129338 


T56800 


Hs.47274 


Homo sapiens mRNA; cDNA DKFZp564B176 (from clone DKFZp564B176)


0.086 


132789 


W23761 


Hs.56876 


ESTs 


0.086 


116099 


AA456309 


Hs.58831 


regulator of Fas-induced apoptosis 


0.086 


100721 


HG3355-HT3532 




Peroxisome Proliferator Activated Receptor (Gb230972)


0.087 


112569 


R73150 


Hs.75270 


GTP-binding protein homologous to Saccharomyces cerevisiae SEC4


0.087 


130645 


AA020942 


Hs.17200 


STAM-like protein containing SH3 and ITAM domains 2


0.087 


100751 


HG3527-HT3721 




Luteinizing Hormone, Beta Subunit 


0.087 


134550 


M27161 


Hs.85258 


CD8 antigen; alpha polypeptide (p32)


0.087 


130885 


AA338646 


Hs£0912 


adenomatous polyposis coli like


0.087 


101446 


M21302 


Hs£6306 


small proline-rich protein 2A


0.087 


116287 


AA487856 


Hs.155829 


KIAA0676 protein 


0.087 


134034 


X89267 


Hs.78601 


uroporphyrinogen decarboxylase 


0.087 


130860 


U66061 


HsJ241395 


protease; serine; 1 (trypsin 1) 


0.087 


109901 


H04992 


Hs.30499 


ESTs 


0.087 


107537 


Z20777 


Hs.9857 


ESTs; Weakly similar to peroxisomal short-chain alcohol
dehydrogenase [H.sapiens]


0.087 


133232 


AA496030 


Hs.6845 


ESTs 


0.087 


108559 


AA085161 




zn12c5.s1 Stratagene hNT neuron (#937233) H.sapiens cDNA clone
IMAGE:54728 3' similar to TR:G1151228 G1151228 LPG1P. ;, mRNA seq


0.087 


121288 


AA401735 


Hs.97340 


EST 


0.087 


108844 


AA132916 


Hs.177961


Human Chromosome 16 BAC clone CIT987SK-A-388D4 


0.087 


129874 


AA406488 


Hs.181551 


ESTs 


0.087 


105139 


AA164543 


Hs.110082


ESTs 


0.088 


124789 


R43803 


Hs.78110 


ESTs; Weakly similar to F17A92 [C.elegans] 


0.088 


115923 


AA441929 


Hs.38205 


ESTs 


0.088 


123640 


AA609292 


Hs.112681


ESTs 


0.088 


131607 


AA351409 


Hs.172740 


microtubule-associated protein; RP/EB family; member 3


0.088 


130064 


T67053 


Hs.181125 


immunoglobulin lambda gene cluster


0.088 


108752 


AA127070 


Hs.71055 


ESTs 


0.088 


124249 


H68077 


Hs.108211


ESTs 


0.088 


100109 


AJ000480 


Hs.143513


phosphoprotein regulated by mitogenic pathways


0.088 


104642 


AA004662 


Hs.184245


KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog


0.088 


131752 


AA453311 


Hs.31566 


ESTs 


0.088 


114727 


AA132545


Hs.190202


ESTs 


0.088 


120965 


AA398089 


Hs.179715 


ESTs 


0.088 


100396 


D84361 


Hs.151123


Human mRNA for p52 and p64 isoforms of N-Shc; complete cds 


0.088 


106218 


AA428451 


Hs.91146 


DKFZP586E0820 protein 


0.088 


111562 


R09567 


Hs.187569 


ESTs 


0.088 


121219 


AA400606 


Hs.144344


EST 


0.088 


101187 


L20316 


Hs^08 


glucagon receptor 


0.088 


101513 


M28210 


H&27744 


RAB3A; member RAS oncogene family 


0.088 


116454 


AA621071 


Hs.42034 


ESTs; Moderately similar to T-complex protein 10A [H.sapiens]


0.088 


116171 


AA463434 


Hs.42658 


ESTs 


0.089 


117500 


N31909 


Hs.44278 


ESTs 


0.089 


119978 


W88623 


H&59190 


EST 


0.089 


132005 


D58231 


Hs.173091


DKFZP434K151 protein 


0.089 


109914 


H05529 


Hs.194704 


leucine-rich; glioma inactivated 1


0.089 


130370 


M55265 


Hs.155140 


casein kinase 2; alpha 1 polypeptide 


0.089 


104262 


AF009801


Hs.105941


bagpipe homeobox (Drosophila) homolog 1


0.089 


129708 


AM17181 


Hs.120858 


ESTs 


0.089 


106398 


AA447545 


Hs.18268


adenylate kinase 5 


0.089 


120884 


AA365356 


Hs.97041 


ESTs 


0.089 


130404 


X72012 


Hs.76753 


endoglin (Osler-Rendu-Weber syndrome 1)


0.089 


114072 


Z38184 


Hs.123633 


ESTs 


0.089 


131470 


X54938 


Hs.2722


inositol 1;4;5-trisphosphate 3-kinase A


0.089 


124573 


N67935 


Hs.194703 


adaptor-related protein complex 4; mu 1 subunit 


0.089 


114717 


AA131240 


Hs.252014


EST 


0.089 


133806


M12759 


Hs.76325 


Human Ig J chain gene


0.09 


130470 


AA398552 


Hs.15711 


KIAA0639 protein


0.09 


133182 


Z80787


Hs.240135 


H4 histone family; member J 


0.09 


116036 


AA452572 


Hs.43866 


ESTs 


0.09 
132404
122695 
125975 
110783 
129860 
120740 
119564 
134474 
119014 
109791 
117605 
121589 
104326 
129861 
102795 
119626 
110516 
105382 
123754 
108008 
121057 
123675 
135194 
127070 
134051 
133382 
103615 
118457 
118504 
112915 
132088 
101504 
112550 
128551 
112879 
127079 
101993 
113020 
120465 
130152 
104941 
110090 
135375 
123799 
118966 



125147 
100836 
114726 
107311 
112863 
129290 
103384 

112508 
111863 
131184 
107420 
111768 
112290 
130581 
120744 
112226 
116154 
102640 
129797 
102705 
132408 
108441 



AA393903 

AA456048 

AA495891 

N23669 

AA410343 

AA302650 

W38206 

AA054746 

N95435

F10669 

N35073 

AA416627 

D81655 

N69507 

U88667 

W49499 

H56894 

AA236853 

AA609964 

AA039430 

AA3S8619 

AA609474 

C20975 

AA641812 

S67070 

Ml 12532 

Z46967 

N66593 

N67334 

T10176 

AM70121 

M27288 

R71391 

H09058 

T03541 

AI364691 

U01062 

T23830 

AA251505 

U32645 

AA065169 

H16076 

AA480888 

AA620418 

N93438 

H80833 

W38150 

HG4113-HT4383 

AA132509 

T57738 

T03148 

AA521407 

X92762 

R68213 

R37495 

AA452705 

W26567 

R27606 

R53940 

AA481982 

AA302772 

R50761 

AA460951 

U67674 

X53595 

U77180 

AA035547 

AA078079 



Hs.4768 

Hs.99403 

Hs.152290 

Hs^6407 

Hs.129826 

Hs.96654 

Hs.8379 

Hs^5144 

Hs.13228 

Hs.44433 

Hs.191598 

Hs.143067 

Hs.129849 

Hs.198396

Hs.184456 

Hi37368 

Hs.111801 

Hs.102021 

Hs.61920 

Hs.142375 

Hs.112713 

Hs.9613 

Hs.190037 

Hs.78846 

Hs.7247 

Hs.115460

Hs.49230 

Hs.50158 

Hs.4254 

Hs.243960 

Hs.248156 

Hs.29074 

Hs.237323 

Hs.115960

Hs.128628 

Hs.77515 

Hs.7303 

Hs.130861 

Hs.151139 

Hs.17805 

Hs.6915 

Hs.99741 

Hs.112861

Hs.76907 

Hs.143038 



Hs.103827 
Hs.174112 
Hs.4610 
Hs.110095
Hs.79021 

Hs£8847 

Hs.23578 

Hs.23954 

Hs.4775 

Hs.24185 

Hs^6016 

Hs.16258 

HSJ228649 

Hs£5738 

Hs.57100 

Hs.194783 

Hs.1252 

HsJ0002 

Hs.47822 



ESTs 0.09 

ESTs; Moderately similar to undulin 2 [H.sapiens] 0.09

ESTs; Highly similar to PACAP type-3/VIP type-2 receptor [H.sapiens] 0.09

ESTs 0.09 

tetraspan transmembrane 4 super family 0.09 

EST 0.09 

Accession not listed in Genbank 0.09 

ESTs 0.09 

ESTs 0.09 

DRE-antagonist modulator; calsenilin 0.09

ESTs 0.09 

ESTs 0.09 

ESTs 0.09 

DKFZP564M182 protein 0.09 

ATP-binding cassette; sub-family A (ABC1); member 4 0.09

ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 0.09

EST 0.09
Homo sapiens mRNA; cDNA DKFZp564H2023 (from clone DKFZp564H2023) 0.09

ESTs 0.09 

ESTs 0.09 

ESTs; Moderately similar to putative envelope protein [H.sapiens] 0.091 

EST 0.091 

ESTs; Highly similar to angiopoietin-related protein [H.sapiens] 0.091

ESTs 0.091 

heat shock 27kD protein 2 0.091 

ESTs 0.091 

calicin 0.091

EST 0.091 

ESTs 0.091 

ESTs 0.091 

HLA-B associated transcript-3 0.091 

oncostatin M 0.091 

ESTs 0.091 

N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 0.091

ESTs 0.091 

ESTs; Moderately similar to CL3BC [R.norvegicus] 0.091 

inositol 1;4;5-triphosphate receptor; type 3 0.091

ESTs; Weakly similar to PROHIBITIN [H.sapiens] 0.091

ESTs 0.091

E74-like factor 4 (ets domain transcription factor) 0.091

ESTs 0.091 

ESTs 0.091 

ESTs; Weakly similar to BRAIN PROTEIN H5 [H.sapiens] 0.091

ESTs 0.092

ESTs; Highly similar to HSPC002 [H.sapiens] 0.092

ESTs 0.092

Accession not listed in Genbank 0.092

Olfactory Receptor Or17-201 0.092 

EST 0.092 

ESTs 0.092 

EST 0.092 

ESTs 0.092
tafazzin (cardiomyopathy; dilated 3A (X-linked); endocardial 

fibroelastosis 2; Barth syndrome) 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to KIAA0584 protein [H.sapiens] 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-5A [H.sapiens] 0.092

EST 0.093 

ESTs 0.093 

ESTs 0.093 

solute carrier family 10 (sodium/bile acid cotransporter family); member 2 0.093 

apolipoprotein H (beta-2-glycoprotein I) 0.093

small inducible cytokine subfamily A (Cys-Cys); member 19 0.093

KIAA0380 gene product; RhoA-specific guanine nucleotide exchange factor 0.093
zm97c9.s1 Stratagene colon HT29 (#937221) Homo sapiens cDNA clone 
108145
106466 
101697 
121294 

117824 
115771 
102303 
131405 
112909 
124173 
112488 
130554 
106413 
111711 
117595 
113813 
107769 

114966 

130297 
109589 
112592 
102314 
116128 
106809 



130607 
120592 
117230
105948 
101333 
101909 



127034
134430 
120342 
104450 
130902 
102708 
107373 
123569 
102687 
128888 
100283
102747 
107798 
123565 
116010 
117155 
133094 
113174 
102016 
130126 
134813 
132055 
122229 
127574 
134432 
128052 
101637 
103386 
133079 
120328 



AA054133 
AA449990 
M64358 
AA401958 

N49065 

AA422049 

U33053 

U79255 

T10069 

H41281 

R66896 

X59303 

AA447964 

R22891 

N34933 

W45174 

AA018449 

AA250743 

H94949 

F02429 

R77631 

U34038 

AA459915 

AA479704 



AA043894 

AA281929 

N20535 

M404597 

L47738 



AA497031 

AA352389 

H52105 

AA207105 

L77564 

AA424530 

U77594 

U85773 

AA608952 

U73379 

AA034951 

D43642 

U79303 

AA019346 

AA608907 

AA449450 

H97536 

AA115572

T54659 

U03270 

AB002318 

X14767 

N69440 

AA436198 

AA907314 

AA053022 

AA878398 

M58285 

X92972 

AA477561 

AA196979 



Hs.63085 
Hs.76057 

Hs.240170 

Hs.125201 

Hs.40780 

Hs.2499 

Hs.26468 

Hs.101094 

Hs.107619 

Hs.28788 

Hs.159637 

Hs.6311 

Hs.7093 

Hs.44664 

Hs.31382 

Hs.125220 

Hs.92198 

Hs.171955 

Hs.6581 

Hs.29126 

Hs.154299 

Hs.112193

Hs.220324 



Hs.16603 

Hs.143974 

Hs.43265 

Hs.7133 

Hs.80313 

Hs.8657 

Hs.8309 

Hs.45068 

Hs.103978 

Hs.21061

Hs.37682 

Hs.154695 

Hs.195292 

Hs.93002 

Hs.106893 

Hs.2430 

Hs.82482 

Hs.60918 

Hs.112614

Hs.56421 

Hs.42391 

Hs.64746 

Hs.9779 

Hs.122511 

Hs.150443 

Hs.89768 

Hs.38132 

Hs.103902 

Hs.188905 

Hs.8312 

Hs.190491 

Hs.132834 

Hs.80324 

Hs.6449 

Hs.104129 



IMAGE:545872 3' similar to contains element MER22 MER22 repetitive

element ;, mRNA sequence 0.093 

ESTs 0.093 

lysophospholipase II 0.093

Human rhom-3 gene, exon 0.093 
ESTs; Moderately similar to alternatively spliced product using 

exon 13A [H.sapiens] 0.093

ESTs; Weakly similar to B7 [M.musculus] 0.093

ESTs 0.093

protein kinase C-like 1 0.093

amyloid beta (A4) precursor protein-binding; family A; member 2 (X11-like) 0.093

ESTs 0.093 

ESTs 0.093 

ESTs 0.093 

valyl-tRNA synthetase 2 0.093

ESTs 0.093 

ESTs 0.094 

EST 0.094 

ESTs 0.094 
Homo sapiens DNA from chromosome 19-cosmids R30102:R29350:R27740

containing MEF2B; genomic sequence 0.094 
ESTs; Highly similar to calcium-regulated heat stable protein

CRHSP-24 [H.sapiens] 0.094

trophinin-assisting protein (tastin) 0.094 

ESTs 0.094 

ESTs 0.094 

coagulation factor II (thrombin) receptor-like 1 0.094 

mutS (E. coli) homolog 5 0.094
Human DNA sequence from clone 283E3 on chromosome 1p36.21-36.33.
Contains the alternatively spliced gene for Matrix Metalloproteinase in the
Female Reproductive tract MIFR1; -2; MMP21/22A; -B and -C; a novel gene;

the alternatively spliced CDC2L2 gene for 0.094

ESTs 0.094 

ESTs 0.094 

melastatin 1 0.094 

ESTs 0.094 

p53 Inducible protein 0.094 

Homo sapiens mRNA for PLE21 protein; complete cds 0.094 

ESTs; Highly similar to CTG7a [H.sapiens] 0.094 

ESTs; Wkly smlr to glucose-6-phosphatase catalytic subunit [R.norvegicus] 0.095

KIAA0747 protein 0.095 

Homo sapiens mRNA; cDNA DKFZp434I143 (from clone DKFZp434I143) 0.095

serine/threonine kinase 22B (spermiogenesis associated) 0.095

ESTs 0.095

retinoic acid receptor responder (tazarotene induced) 2 0.095

phosphomannomutase 2 0.095

ESTs; Weakly similar to RNA helicase HDB/DICE1 [H.sapiens] 0.095

ubiquitin carrier protein E2-C 0.095 

ESTs 0.095 

transcription factor-like 1 0.095 

protein predicted by clone 23882 0.095

EST 0.095

EST 0.095

ESTs; Weakly similar to Similarity to H.influenza ribonuclease PH [C.elegans] 0.095

EST 0.095 

chloride intracellular channel 3 0.095 

ESTs 0.095 

centrin; EF-hand protein; 1 0.095 

KIAA0320 protein 0.095 

gamma-aminobutyric acid (GABA) A receptor; beta 1 0.095 

ESTs 0.095 

ESTs 0.096 

ESTs 0.096 

ESTs 0.096 

ESTs 0.096 

hematopoietic protein 1 0.096 

protein phosphatase 6; catalytic subunit 0.096 

ESTs 0.096 

ESTs; Weakly similar to protease [H.sapiens] 0.096
107640


M009815 


Hs.257808 


ESTs 


0.096 


123389 


AA521176 


Hs.221231 


ESTs 


0.096 


103222 


X74795


Hs.77171


minichromosome maintenance deficient (S. cerevisiae) 5 (cell division cycle 46)


0.096 


111704 


R22450 


Hs.23396 


ESTs; Highly similar to ZINC FINGER PROTEIN 140 [H.sapiens]


0.096 


126656 


AAS06523 




EST177475 Jurkat T-cells VI Homo sapiens cDNA 5' end, mRNA sequence.


0.733 


127071 


AA250806 




ESTs 


0.096 


114550 


AA056755 


Hs.151714 


ESTs 


0.096 


125955 


AI356943 


Hs.143761


ESTs 


0.096 


134363 


M37033 


Hs.82212 


CD53 antigen 


0.096 


128550 


W76492 


Hs.170142 


ESTs 


0.096 


122598 


AA453465 


Hs.99329 


ESTs 


0.096 


118898 


N90703 


Hs.4236


KIAA0478 gene product


0.096 


117661 


W39092 


Hs.44940 


ESTs 


0.096 


120996 


AA398281 


Hs.143684


ESTs 


0.096 


123388 


AA521172 


Hs.134417 


ESTs 


0.096 


106700 


AA463929 


Hs.28701 


ESTs 


0.096 


112962 


T16814 


Hs.6828 


ESTs 


0.096 


121262 


AA401372 


Hs.97723


ESTs 


0.096 


134551 


R44839 


Hs.8526


i-hp.ta-1 '3-N-flRfilvkiltimsAmlm/ftrflnsfsfasfi 


0.096 


112060 


R43754 


Hs.21164


ESTs 


0.096 


134678 


AA039935 


Hs.182595 


dynein; axonemal; light polypeptide 4 


0.096 


100855 


HG4234-HT4504 




Mfithvtonfltfitrahvdrofolatfi RfldLrdasG 
(viouiyiuiioicuaiiyuiujuiauj nowwjo 


0.097 


1Q9414 
IOCH 1*4 


" N91193 


Hs.48145


ESTs


0.097 


1129D0 


708758 


Hs.3813 


ESTs 


0.097 


115989 


AA447777 


Hs.93135 


ESTs 


0.097 


103561


Z21488 


Hs.143434 


wintarlin 1 


0.097 


131087 


AA009738


Hs.22824 


ESTs; Weakly similar to p160 myb-binding protein [M.musculus]


0.097 


120293 


AA190859


Hs.191428 


ESTs 


0.097 


111830 


R36081 


H«v25Q85 


EST 


0.097 






He 17RR6 
113. 1 / uuu 


ESTs 


0.097 




AA17Q338 


Hs.5476 


eorinn nmtfitnaeA Inhibitor 

dot II ID piv/lDUiaOO IIIIMUItUI 


0.097 


120162 


24Q125 


Hs 91968 


ESTs 


0.097 


1QOQ7Q 


1 J1fi9fli> 


He 5R81 


PI 1 nana M1-1Q K/elnfl.rioh Ipitkamta nenfi^ * 
yen io \ i ii9 lyoii ic iiw i icunciiua yoi 10/ 


0.097 


IOh£ 1 1 


AAUODDO 1 


no.ouut 1 


P<?Te* lA/aaktv eimiiar in 890Q n fT) motannnaetprt 


0.097 






He 165051 


ESTs 


0.097 


118118 


N56901 


Hs 47995 


ESTs 


0.097 


107598 


AA004528 


Hs 169444 

1 ID* 1 VJ t T 1 


ESTs 


0.097 


198Q33 


H01fi?4 


Hs.760 


fiATA-hlnffinn nrnlpin 9 
\jAin i/iiiuiiiy ^/t\ju}»% l. 


0.097 






HS.8BQ24 


ESTs 


0.097 




S75168 


Hs b 274 


mAnakarunn/tf^aecftciatprf h/rnerrifl kina^fi 


0.097 




AA959374 


Hs 19333 


P^Te* Wpaldv eimilar tft ATP/fiTPVhinriinn orotsbi IH saolensl 

tv> 1 0, Hcowy oiiiiiiai ivnir \vj 1 r ^un hjii iy pi widh i i^a^icmj 


0.097 


128155 




Hs.143302


ESTs 


0.097 


11R978 




Hs.44914 


ESTs 


0.097 


111964 


R41227 


Hc_21860 


ESTs 


0.097 


145100 


AAOQRQOC 


He 951108 


Hnmn canione mRNA* f+ironv\enmfl 1 enprjftc trflnerrinl KIAA0AP3 


0.097 




rvoy t d 1 


He 101508 


EST 


0.097 


103084 




Hs.77793 


p. err tvrosinfi kinase 


0.097 


124138 


H931Q9 
rut? J9? 


Hs 107010 


ESTs 


0.098 


lOVA/tO 


no 1 /HQ 


H«s211612 


SP024 fS cprfivkiae^ rfihtpd oonfi famihr member A 


0.098 


1009 or 


L/tD It 3 


He 7R994 




0.098 


123537 


AA508775 


He 112589 


ESTs 


0.098 




in aou 1 9 


He 55099 


CCTe 
coi s 


0.098 




WOUOO** 


He Q853 




0.098 


IIOOIQ 


RQRR1R 


Hs.35984 


ESTs 


0.098 




JUOUUO 


We 9971 




0.098 


127353 


AA190853


Hs.155360 


ESTs 


0.098 


132068 


X66365 


Hs.38481 


cyclin-dependent kinase 6


0.098 


105744 


AA293436 


Hs.12909 


ESTs 


0.098 


133680 


M92357 


Hs.101382 


tumor necrosis factor; alpha-induced protein 2 


0.098 


122899 


AA469960 


Hs.178420 


ESTs; Highly similar to WASP interacting protein [H.sapiens]


0.098 


128700 


U59286 


Hs.103982 


small inducible cytokine subfamily B (Cys-X-Cys); member 11 


0.098 


104393 


H46486 


Hs.226499 


nesca protein 


0.098 


123320 


AA496792 


Hs.139572 


EST 


0.098 


129169 


N31641 


Hs.109058 


ribosomal protein S6 kinase; 90kD; polypeptide 5


0.098 


135093 


U51333 


Hs.159237 


hexokinase 3 (white cell)


0.098 


113269 


T65159 


Hs.85044 


ESTs 


0.098 


124283 


H86783 


Hs.194136 


ESTs; Moderately similar to zinc finger protein RIN ZF [R.norvegicus]


0.098 


114376 


GMCSF 




Accession not listed in Genbank


0.099 


100881 


HG4458-HT4727 




Immunoglobulin Heavy Chain, Vdjc Regions (Gb±23563) 


0.099 
116572


D45654 


Hs.655fi2 


DKFZP586C1324 protein 


0.099 


404QCC 




He 119R47 
nj. t \cohi 


EST 


0.099 


innflift 

IUUOIO 


nwu i on i **coo 




Opioid-Binding Cell Adhesion Molecule


0.099 


132754 


W47419 




Human DNA from chromosome 19-specific cosmid F25965; genomic sequence


0.099 


1 19741 


n90uou 




ESTs 


0.099 


112748 


R93299 


Hs.166492


ESTs 


0.099 


IOUOOO 


OOIC.OO 




CDG8 flntinfln 


0.099 


124870 


R69233 


H&1015O4 


ESTs 


0.099 




239833 


He 124940 


GTP-binding protein


0.099 


121297 


AA401995 


Hs.97860 


ESTs 


0.099 


128602 


AA046103 


Hs.102367 


ESTs 


0.099 


124062 


H00440 


Hs.144524 


ESTs; Weakly similar to signal transducer and activator of
transcription 2 [M.musculus]


0.099 


1 IAJCN f 


Hf»14Q-HT221Q 




Mucin fGb'M57417\ 


0.099 


105652 




Hs.19015


ESTs 


0.099 




A CMC 


Hs.72660


KIAA0585 protein 


0.099 


133503 


M33195 


Hs.743 


Fc fragment of IgE; high affinity 1; receptor for; gamma polypeptide 


0.099 


1 OQ4A1 




H&58210 


ESTs 


0.099 


IU£vuO 


[ IDQ1 17 


Hcfi077fi 


nhnenhnltnaeo C* dpfta 1 


0.099 




TftfiQ31 




ESTs 


0.099 


1 (\49Afl 


ARfifW^Afl 
MDWtOOO 




KIAA0370 protein


0.099 


l£ 1 MJ 


A AQQQinQ 

AMoyy ivy 


He 1R1R11 
n*. id io to 


ESTs 


0.1 


122090 




He Q7QQQ 

ns,y/oyy 


poTe* WpaIcIv simitar to rial?- krr^43- CAJ- 0 I^ALC YEAST P25335 

CO 1 J>, V VcaRiy a Ml Dial IU Ual£, fat 1. 0*+0| \*m. v. 1 1 , nuv/_ 1 1— rw i r£.vhjo«j 

Al LANTOICASF rS.rarevieiae1 


0.1 


102405 


U43148 


Hs.159526 


patched (Drosophila) homolog


0.1 


103599 


Z33905 


Hs.81218 


receptor-associated protein of the synapse; 43kD 


0.1 


121079 


AA398719 


Hs.14169 


ESTs; Weakly similar to CREB-binding protein [H.sapiens]


0.1 


115820 


AA427487 


Hs.39619 


ESTs; Weakly similar to RETICULOCALBIN 1 PRECURSOR [H.sapiens]


0.781 


125106 


T95766 


Hs.189760 


ESTs 


0.1 


131373 


N68116 


H&26146 


Down syndrome critical region gene 3 


0.1 


120224 


Z41239 


Hs.106960 


ESTs 


0.1 


133090 


AA448228 


Hs.6468 


ESTs 


0.1 


132300 


AA133244 


Hs.44234 


ESTs 


0.1 


113129 


T49384 


Hs.8988 


EST 


0.1 


110638 


H73197 


Hs.17241 


ESTs 


0.1 


131364 


R53255 


Hs.26010 


ESTs 


0.1 


105370 


AA236476 


Hs.22791 


ESTs; Weakly similar to transmembrane protein with EGF-like and two 


0238 








follistatin-like domains 1 [H.sapiens]
TABLE 11A shows the accession numbers for those primekeys lacking UniGene IDs in
Table 11. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 
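
By way of illustration only, a row of Table 11A can be read into its three columns as sketched below in Python. The names `ClusterRow` and `parse_cluster_row` are hypothetical and are not part of the original disclosure. The sketch assumes that each row is a single line of whitespace-separated fields, with the Pkey first, the CAT number second, and one or more accession numbers after that.

```python
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of Table 11A (hypothetical container, for illustration)."""
    pkey: str              # unique Eos probeset identifier number
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-separated row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(
            f"expected Pkey, CAT number and at least one accession: {line!r}"
        )
    return ClusterRow(pkey=fields[0], cat_number=fields[1], accessions=fields[2:])


# Row for Pkey 100545 as it appears in Table 11A.
row = parse_cluster_row("100545 22955.11 M55405 AW752552")
print(row.pkey, row.cat_number, row.accessions)
```

Keeping the CAT number as a string rather than a float preserves identifiers such as `tigr_HT2219` that are not numeric.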



Pkey CAT number 
100610 19864J 



100674 21517.2 



108559 41469.9 

100721 19818.1 

100748 41861.1 

100750 15759J 



100751 24700J 



100760 1334.7 
100775 18179.3 



Accession 

AW1 61357 A1879062 AI928938 AW161097 AW161 1 67 BE314465 AA351715 F07096 AA179034 F08510 RJ0653 AI936671 
AA476718 AW772454 AI807703 R44253 AA976667 AI985186 A1650254 H38942 R84829 AA018724 AA001000 H85934 
AA019126 H85609 AA017000 AA339355 AW950556 D51397 AA213981 BE548002 A1056359 AA0O156O AW9521 13 
AA317769 A1857477 AI857475 AW249771 AW162661 H38943 AA018628 R85885 AI984613 AI934765 A1796172 AW157488 
AI929191 R85523 D51221 D53851 H85610 AI749674 F21582 AA323145 AA019127 AA687444 T06745 AI699293 H29532 
AA214029 AA223656 NM 016834 X14474 R19697 H09695 R17455 R13812 R19056 AI681231 A159020O R37671 AA861828 
AI990023 AI935669 AW005821 AA324581 H17335 R37659 R42802 R46242 R60936 R59731 H28993 AM79907 R44570 
AI890696 AA308884 AA5O7078 R41274 AI365507 T16348 A1560453 F03259 F04722 T16312 AA016081 AW073061 
BE314824 W28930 R44098 R51045 

AW403342 AW248986 BE561709 AA357312 BE311834 BE389496 BE294887 AW732696 BE047868 AI702383 BE019155 
AI702367 BE408966 BE280458 BE313759 BE513492 BE535404 BE280258 AC005263 NM_007165 L21990 AW732711 
AI564920 AW249094 BE265365 AW607186 AW607346 BE005217 H2721 1 U46230 BE260066 BE207043 BE546782 
AW248659 
AA08522BAA085161 

L4O904 NM_005037 X90563 AB005526 H21596 AA088517 
X06096 X05826 

BE157260 BE157265 R481 18 H43827 217877 AW379070 AW291778 M20605 J03253 M14206 V00568 AI860465 AW296022 
M13930 AL047400 J00120 BE018476 AW675223 T26980 F06694 R22709 R24720 H22753 AI9031 00 AI903094 AW937823 
X00364 D10493 K01904 K01 906 K00535 L00058 AA410662 AW384760 AA304930 AI680985 X00198 H58025 AW998901 
AV653447 N31654 AW610357 AW610369 AW862480 BE223010 AW384172 AW384219 AW384171 AW384218 AA298522 
BE140421 AW945162 AW75171 1 AA514409 AW747912 AI214214 W87741 AA972406 AA554513 BE3Q2087 A1249030 
AA477850 AV653129 AI281360 AJ2741 10 W87861 AA641366 X66258 AI051600 AA877139 AA527483 AA857219 AI250782 
AA625531 AA807892 AI278811 AI224033 H24033 AA593396AW1 29709 R45453 N22772 AA235530 T29737 AI01 6409 
AI688907 AA568370 AA722760 AI539329 AA550843 AW674698 AI538452 AI538453 AI337957 AA477744 AA464600 
AI140319 AW949294 A1339781 AI828736 AA923634 AA344094 AI278350 AA975567 AA908416 AA857170 AW023520 
R43413 R48004 F02958 A1989439 R1 1207 AA737307 D10493 AW950652 AI093842 AI474024 AA703369 R1 1264 M13930 
M13930 M13930 M13930 M13930 J00120 M13930 M13930 X00364 J00120 R19507 AA639812 
N32759 N29730 N30831 N32604 N31955 AI206390 H87574 R23494 AI186215 N30036 AI741512 J00117 NMJX»737 
AI453626 AA330974 AI188729 AI188604 AJ188964 N30276 AI188947 AI188830 A1188303 AI200457 AI219166 AI192459 
AI183280 AI189275 AI188639 A1186353 AI189616 AI184224 AI130720 AJ 188454 AI188391 AI148857 AI192447 AI209155 
AI190013 AI206355 AI188721 AI189429 All 89364 Al 1 66330 AI43 1595 All 89595 AI188781 AI148647 A1200022 AI221552 
AI220923 All 88728 AA233034 AI189807 AI189641 AI219044 AI148774 AI200658 W71989 AI207360 AI188824 AI200559 
AI200270 AA644163 AH99943 Al 151301 AI189555 AI262724 AU48590 AI148695 AI126906 AI149163 KG31B3 K03189 
A1189B42 AI221014 N30608 AM 86465 AI220865 AI188498 AI138226 AI189968 AI221019 AI138197 AI149426 AI148904 
AI186218 AI188348 AI160579 AI198460 AI149039 AI160936 AI219055 AI184784 AI221580 AM 61082 AI160814 AI123896 
AI417614 AI126101 AM88872 AI149571 AI168533 AI149072 AI149467 AI131286 N30684 AI160705 AI160692 AI149559 
AI273580 AI189442 AI138448 AI149591 N27302 AA4 009 1 0 Al 138431 Al 138435 AJ 128407 N30216 AJ128296 AI219589 
AI188492 AI149447 AI168482 H95374 AI219009 N31616 A1276216 N32233 AI291937 N30741 AI1886B9 N27111 R23214 
AI221605 AI184348 AI200375 H94451 N26397 AI871881 AA232905 N30833 AI220780 H94446 N30822 H87464 R68815 
N3Q290 AI128424 H12587 T47334 H87631 H87156 AI219133 AI868741 AA330859 H86993 AA330413 H93656 N30817 
T90191 H93868 AC00054 H95207 T47316 H95381 T49170 R00880T49171 N27381 H94107 R63352 TB5053 AW451899 
H95142N30313 K94015H86987 T28278 N29701 C18834 AA331267 AA330939 AI654493 N27073 N29831 R68113N30758 
R26086 N32108 H95135 AA330414 AA330978 AI219422 All 89453 All 99951 X00264 NM_000894 AA371909 AA063496 
T29543 AA371971 AA372026 AA371978 AA371346 AI051683 AI166418 AI220659 AI189068 AI219266 AI186552 AH 68715 
AI149156 

AW794626 M27126 M27014 

J05581 M61 170 T27692 M34C88 M34089 AW860335 AW579047 AW610437 AW610386 AW610422 AW610473 AW579078 
AW604897 AW860163 AW579067 AW862410 AI816584 AW177757 AW602769 AI909790 AW860331 AI909787 AI90981 1 
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100800 24735J 



100818 19604.3 

100881 458J27 

100885 12707.3 

100898 8542.1 



102459 
126126 
102620 
102673 
102675 
102753 
102799 
127034 



3556.1 

1630017.1 

16821.37 

24986.6 

5145.4 

2226.1 

34624.4 

51148.2 



103522 21640.1



127071 188097.1 
126456 291965.1 
119388 1762256.1 
126856 20669.1 



103996 224545.1 



113213 23798.1 



134947 844579J 
129311 16078.1 



AJ909813 AWB45083 AIS05920 AW387919 BE1 40766 AI909279 AW369405 AA429321 AA429320AA367451 AA847972 
AW001 137 A1567905 T84561 A1631295 AA151351 H02932 AI884519 AA367457 AW369421 A1678846 AW391 803 AI61 0869 
AW1 92838 AI922289 AJ952140 AI910233 AI479474 AW001395 AA488073 AI985760 AW130017 AI858369 AA627845 
AW081 805 AA158865 AI624443 AA344985 AA569793 R72486 AI5B9329 AI903204 A1269893 AA541284 AI279932 AA149270 
A1697120 AA729146 A1589353 AA480067 AI923310 AA530908 AI275395 AA425062 AA580280 AA889527 AA158866 
AW131341 AA573028 AA877326 T29335 AW951288 H04235 AA099243 AA994659 AI659618 AA887919 AI299297 
AW001 116 AW263844 AI270578 AA970828 AW572126 M775299 AW369449 AW369398 AW369452 AI933677 AI870710 
AI09291 1 AB82464 AI497674 AA937026 AA885865 138597 AA908325 AW369432 AWQ26623 AA627778 A1264942 
AA932409 AI187328 A1672970 AI886098 AW440471 AW138860 A1866858 AI802528 AI926172 AW243914 AI933690 
AA9961 14 AA536189 AW009937 AI918060 A1270379 AI973169 AW175638 AW369413 

NM.006227 126232 R50649 AU077024 AL008726 AA41 1079 R35151 BE278153 BE278139 AI459777 R88036 Z43210 
F07326 AF052157 R17844 BE615476 T82160 R71985 H21963 AA299158 AW368246 R48123 R50628 R70441 H27245 
H72015 R72345 R39392 AJ909738 BE612778 BE613234 052116 D52136 D52132 D52067 D51922 D51995 D51905 N34249 
N25459 AA464436 AA297350 AA297466 R81736 HQ2737 AW5B2505 R27523 AI834241 AW130867 W72668 W76426 
AA358363 R5Q262 AW473860 H52335 H43953 H21964 T39505 AI887517 AW156925 AW839850 H02628 AW007705 
AI561008 F22392 R71279 AA995433 R50725 W24462 R71931 AA464437 AW591731 R25667 R52695 R50810 AI560805 
AI089266 H68386 H41353 H28590 AW001860 AI141623 AA250773 AI284778 AW511412 AW083975 AA130377 AW026047 
R50551 R81494 AI357668 AI078272 F32666 F36981 AW304865 H43906 AA931068 R48010 A1540217 AI017339 AI291812 
AI741954 AA458490 AI088378 AA298764 H6116S AA358362 AA298725 AA298515 AA464148 AA443538 R43046 AA084314 
T40641 T47608 T48940 A1082477 AW470145 N92284 AI758958 AA298512 AA284586 AI597777 AA480277 AJ 932559 
A1869081 AA476615 AA503651 AI656024 AW168522 A1682051 A1689106 AI274592 A1520917 BE258916 BE615861 
BE280282 R53386 BE278255 BE278398 T47607 AA477662 K68385 

100817 19648.1 L34355 L46810 NNUJ00023 U08895 AA424260 AI097272 AA424162 N79764 F19290 F25278 AI479385 
AA460662 AA432059 AW016935 F25770 F32549 F36677 F33016 F35992 F36010 AW172497 AA835076 F28727 AA21 1643 
AA453282 

U79251 AA843851 R38201 R66461 R44908 M683289 H17477 R37364 R52832 AW298336 AA351391 NM.002545 L34774 
AA296886 AW967001 T28889 R13451 T77331 AL1 19196 AL1 18830 H08459 AW892812 AW905838 H17585 R52878 
BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BE514250 BE397832 
BE269598 BE559865 BE396881 BE560031 BE514199 BE560037 BE560454 
X07881 NM.006249 X07637 AA376715 AA376677 X07715 X07704 S80916 

BE387614 R51501 AA199714 AW674779 F08178 BE269071 AA376313 H08264 AA380420 H18785 AL042151 BE277758 

BE267438 NM.005850 L35013 BE540833 BE390902 BE391494 BE277459 BE385592 BE390612 BE384263 BE387779 

BE388647 BE537373 BE547158 AW409585 AW374033 AW6Q2185 AA355725 AW577548 AW935015 AW935160 W40232 

AW938647 AW374332 AA434040 BE293488 AL138361 BE560260 AI745075 AA317980 AW949382 AI83431 1 AI653582 

AI831042 AI361878 AA618606 AA729052 AI424969 AA199715 AW769374 AI82B422 AW044307 AI862816 AI203583 

AW084461 AW514655 AA831883 AA290672 AA831286 AA578510 AW089965 AW150746 AA292743 H22232 A1469275 

AW439312 AA292744 AW471443 AI473989 AA593336 AA464070 AI678937 AW069451 AA970763 AA610480 AA593328 

AA464009 AA768985 A1298928 AA436600 AA464718 AA699361 D61482 D55935 AI369591 AA470695 AI809135 AA640627 

AI568446 R51502 W45467 AI655316 AA463934 AW168609 AW518663 BE045525 Z41251 AI868091 AA908160 AJ026697 

AI886259 AI612932 AA215437 AI956014 BE541087 BE255652 BE265878 BE394102 W27502 

U48936 U36592 X87160 NM.001039 AL036606 AL036420 U35630 AW298574 

W80551 M85370 

AA976427 U66052 

AI457548U72509 

U72512 T93357 R31335 F18090 

L32961 NM 000663 U80226 S75578 AA425061 AA429317 A18151 43 AA910669 AI286022 AI286019 
U88896 U88898 AA916056 T03285 AI341594 AI359534 AI634031 U88897 

BE397750 AA232171 BE562900 BE384894 BE242228 BE206819 BE261742 AA296468 AW959763 BE276164 BE264109 
BE392626 BE256735 AA301453 N55872 H01676 AA292746 AA427485 AA496400 AA352389 

Y10518 Y10514 Z83935 Y10508 AK000055 Y10519 AI142012 AI681 175 BE222219 AA890586 BE504347 BE328064 N 63044 

N51226 A1151248 AI521996 AI924777 AW375954 AI860275 W00549 AI742673 AW612288 AI763062 AA632510 At087347 

AI088070 AI214349 AA890297 AI494156 AI698598 AA631658 AA504593 AA860733 AI266761 AW663214 AW771231 

AA639610 AI769806 AI769746 AW0 14326 AJ 28861 1 

AA250806AA459220 

AA429212 W00881 

T88798 R92430 

AI084125 AI083773 AI479687 AI939609 A1968662 AF129507 NMJH3282 AW971840 AW298508 AA744240 AA811217 
AA827671 AA81 1055 AA806567 AA488977 AA908902 AI637637 AA927056 AI870139 AW340492 AA488755 AA129794 
AA306523 AA354253 BE256277 AC053467 AW962084 

AA321355 AW964592 R23284 H73883 R23382 N47914 C01377 H04668 AW606248 R34447 AA847136 AI684489 AI523112 
AW044269 AI379138 N29366 AA761543 N79248 AA960845 AA768316 AI147926 AI718599 AI880620 R67467 AI216016 
AI738663 H04648 

NM 001395 Y08302 AI434619 AI470328 AI261807 AW024965 AI806537 AI830549 AI640337 AI219065 AW271700 
AW028488 AI133339 AI859205 R51 175 U87167 BE379324 BE392008 AA340819 AA3431 10 T57275 D59164 AW299312 
A1434422 AI936390 AW024975 R4Q262 

AW269126 R09430 T56590 AJ367247 A1253132 BE464248 T58658 AW207785 T58607 
R51 194 AI732276 R53587 A1820697 

AK000526 BE550084 W30689 AW271 859 AA41 1456 Ai34l551 AA242990 AA243027 H87046 D20360 Af 184053 AA146956 
AT721023 AT718944 AA146955 F18215 AA90389O AI700355 AI075430 AA411584 AA878210 AI476760 AW945637 AA630596 
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AA431522 AA301989 AI909058 D12149 N41960 BE222214 AA609922 AA828176 AA393359 AA398693 AW024956 
BE467B05 AW298623 AW264085 AI024454 A1024719 AI431927 T55087 AI61 1014 T54920 AA131253 A1436344 

114427 9724_2 AA017176 AI359979 AA047836 AA01 7063 AA0 1 6303 AA001 545 

114569 110077J AA063315 AA063316 

100106 1562L-5 AFO15910 

100515 342 1 AA305746 D90187T63943AW951 154 T29182 A1734941 013264 AI299239Z18812 AW299859W24476 AA933064 

AA489759 

100531 46038 1 AW888554 AW607282 AA319986 M28590 

100545 22955.11 M55405 AW752552 

100574 17320 2 AA326895 M 10036 NMJXXJ365 N84665 H69414 N84657 AA380453 AA329743 AA357367 AA188770 AA376532 AA353653 

AA158953 AA083176 BE537313 AA181433 D53373 R57376 AA206698 R 14807 H18899 H1 1191 H93892 R25593 T61134 
N93285AA083081 AA831789H13137 AA497014 AA079330 AA182861 H13138W47161 R62913 AA687089AA211112 
AA429237 AL035923 AA 100070 AW392898 AI565433 AA866006 AA214002 AW392865 N79454 AA197181 AI680371 
AA176501 AA737967 AI089225 F34874 AW571437 AI620620 AA573489 AA423816 AA164917 AA458455 T47072 AI569087 
AI261656 AA730919 AI633441 AW195182 AI351622 AW243465 A1872649 AI359227 AA987941 AI693770 T47073 AW779948 
AW51O580 AI635626 AW627601 AA864326 AA953578 AI341418 BE222853 AI241963 AI094663 AA928380 AA493373 
AW043762 AI377783 AW958987 BE619760 AA385240 BE277975 BE280095 AW631 443 AA581048 BE61871 5 BE299610 
C14874 BE559858 BE378455 BE618290 BE544585 AI525575 BE548897 BE2671 10 AA804738 BE269821 AA918133 
BE277647 AA599947 BE280735 BE390239 N74150T12504 AI208197 AW955527 AA1 13897 N40081 H73835 H70393 
AI434041 W22950 AI192661 BE264461 W26486 AA626424 AA196694 T69209 AA857976 AI540287 AA410599 AA864287 
AW950564 AA013320T49283 AI541438 AW804703 AA335534 AA335659 BE562269 BE618802 BE277850 BE546413 
BE280994 AA204813 BE561694 BE543524 BE253647 AW001452 W191 16 BE542508 AA205894 BE254875 BE270033 
A1525906 BE251792 AA975700 BE272138 AW607671 N87686 M10036 BE515060 BE298607 AI745178 U47924 H03193 

100627 tigr_HT279B Z25424 

100756 tigr_HT376B M88357 

100768 tigr_HT3846 L29141 M69180 M81105 

100813 tigr_HT4265 L33999

100836 tigr_HT4383 U04688

100855 tigr_HT4504 U09806

102104 entrez_U12139 U12139

125091 genbank_T91518 T91518

100929 tigr_HT688 X65561

125147 entrez_W38150 W38150

102354 entrez_U38268 U38268

102491 entrez_U51010 U51010

102636 entrez_U67092 U67092

118769 genbank_N74496 N74496

101046 entrez_K01160 K01160

101057 entrez_K03430 K03430

108334 genbank_AA070473 AA070473

108417 483241J AA070853 AA075749 AA075716

108441 genbank_AA079079 AA079079

108786 genbank_AA128999 AA128999

101655 entrez_M60299 M60299

101697 entrez_M64358 M64358

117437 genbank_N27645 N27645

101798 entrez_M85220 M85220

101909 entrez_S69265 S69265

103508 entrez_Y10141 Y10141

103575 entrez_Z26256 Z26256

119332 genbank_T54095 T54095

112161 genbank_R48295 R48295

119564 NOT_FOUND_entrez_W38206 W38206

114376 NOT_FOUND_entrez_GMCSF GMCSF

100478 tigr_HT1067 M22406

100547 tigr_HT2219 M57417

100564 tigr_HT2324 Z11585
TABLE 12 shows genes, including expressed sequence tags, that are down-regulated in
prostate tumor tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

R1: Ratio of background-subtracted normal prostate : prostate tumor tissue
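
As a non-limiting sketch of how an R1 value of this kind might be computed, the Python example below averages background-subtracted intensities for each tissue class and divides the normal average by the tumor average. The function name, the single scalar background term, and the intensity values are assumptions made for illustration; the exact normalization used to generate Table 12 is not specified here.

```python
from statistics import mean
from typing import Sequence


def normal_to_tumor_ratio(
    normal: Sequence[float],
    tumor: Sequence[float],
    background: float = 0.0,
) -> float:
    """Ratio of average background-subtracted normal signal to tumor signal.

    Assumption: one scalar background value is subtracted from every sample.
    """
    normal_avg = mean(value - background for value in normal)
    tumor_avg = mean(value - background for value in tumor)
    if tumor_avg <= 0:
        raise ValueError("average tumor signal must be positive after background subtraction")
    return normal_avg / tumor_avg


# Hypothetical probeset intensities (not taken from the tables).
r1 = normal_to_tumor_ratio([110.0, 130.0], [40.0, 50.0], background=20.0)
print(r1)  # 4.0
```

Under this reading, a value above 1 corresponds to lower expression in tumor tissue than in normal tissue, which is consistent with the table listing down-regulated genes.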



Pkey ExAccn UnigeneID Unigene Title R1



100522 HG1763-HT1780 Prolactin-Induced Protein 17.4

130803 M81650 Hs.1968 semenogelin I 16.765

118068 N53943 Hs.13743 ESTs 13.525

114251 Z39898 Hs.21948 ESTs 12.7

112134 R46025 Hs.7413 ESTs 8.735

101436 M20642 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 8.175

104028 AA361094 Hs.221128 ESTs 8.15

108944 AA149204 Hs.175783 ESTs; Highly similar to growth arrest inducible gene product [H.sapiens] 7.535

103838 AA174173 Hs.12622 ESTs 7.512

120469 AA251741 Hs.25882 DKFZP586M1824 protein 7.175

110279 H29231 Hs.27384 ESTs 6.701

127472 AA761378 Hs.192013 ESTs 6.642

133301 N35229 Hs.7037 pallid (mouse) homolog; pallidin 6.411

102457 U48807 Hs.2359 dual specificity phosphatase 4 6.395

114011 W90385 Hs.15082 ESTs 6.15

101249 L33881 Hs.1904 protein kinase C; iota 6

123265 AA491209 Hs.105265 ESTs; Weakly similar to reverse transcriptase [M.musculus] 6

119322 T49655 Hs.241569 ESTs; Modly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 5.95

101673 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 5.925

115586 AA399218 Hs.92423 ESTs 5.7

120590 AA281780 Hs.111441 ESTs; Weakly similar to similar to Kruppel-like zinc finger protein [C.elegans] 5.7

109748 F10192 Hs.248323 Tubulin; alpha; brain-specific 5.625

134727 X80507 Hs.8939 yes-associated protein 65 kDa 5.5

129171 AA234048 Hs.7753 calumenin 5.486

120390 AA233122 Hs.111460 ESTs; Highly similar to multifunctional calcium/calmodulin-dependent protein kinase II delta2 isoform [H.sapiens] 5.4

131699 R68657 Hs.90421 ESTs; Modly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 5.279

104490 N71503 Hs.43087 ESTs; Weakly similar to dysferlin [H.sapiens] 5.266

102124 U14528 Hs.29981 solute carrier family 26 (sulfate transporter); member 2 5.151

109280 AA196635 Hs.86081 ESTs 5.134

109707 F09739 Hs.185701 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 21920 5.075

108087 AA045709 Hs.40545 ESTs 5.075

135006 M21665 Hs.929 myosin; heavy polypeptide 7; cardiac muscle; beta 5.055

119182 R80664 Hs.77067 ESTs 5.033

129806 R62444 Hs.173373 KIAA0931 protein 4.675

101435 M20543 Hs.1288 actin; alpha 1; skeletal muscle 4.626

125954 R93943 yt72c12 j1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE275735 5', 4.6

113989 W87544 Hs.221184 ESTs 4.559

104432 J03460 Hs.99949 prolactin-induced protein 4.451

112326 R56068 Hs.4268 ESTs 4.45

119063 R16833 Hs.53106 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 4.45

130376 R40873 Hs.155174 KIAA0432 gene product 4.301

122484 AA448286 Hs.98074 ESTs; Highly similar to atrophin-1 interacting protein 4 [H.sapiens] 4.2

104142 AA447006 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING 4.175

129413 N32787 Hs.11123 ESTs; Moderately similar to hypothetical protein 2 [H.sapiens] 4.1

103678 Z84483 Human DNA sequence from PAC 46H23, BRCA2 gene region chromosome 13q12-13 4.05

114266 Z40186 Hs.26409 ESTs 4.05

115206 AA262491 Hs.186572 ESTs 4.048

123723 AA609749 Hs.112759 ESTs; Highly similar to unknown protein [R.norvegicus] 4.041

129130 K97993 Hs.172788 ESTs; Weakly similar to KIAA0512 protein [H.sapiens] 4.028
120217 Z41078 Hs.66035 ESTs 4.028
108536 AA084524 zn19d8.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA 4.023

134460 AA400030 Hs.8360 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens] 3525

120418 AA238010 Hs56613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 3.91

132783 N74897 Hs.5683 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15 3.889

125052 T80174 Hs522779 ESTs;M(xieratelysintotosirTularto^ 3.85 
108600 AA099585 Hs.41175 ESTs 3.833 
103099 X61 100 Hs.8248 NADH dehydrogenase (ubiquinone) Fe-S protein 1 (75kD) (NADH-coenzyme 3.818 
134948 H06773 Hs.93850 protein kinase; AMP-activated; gamma 2 non-catalytic subunit 3.792
120511 AA258144 Hs.221576 ESTs 3.779 
111861 R37460 Hs55231 ESTs 3.768 
113966 W86600 Hs.9842 ESTs 3.75 
131649 AA481254 Hs.30120 ESTs 3.708 
129775 R94659 Hs.12420 ESTs 3.707 
110191 H20568 Hs57182 phospholipase A2-activating protein 3.7
112678 R87160 Hs.33665 ESTs 3.7 
127115 AA375791 Hs.131694 ESTs 3.674
132892 W92797 Hs^9378 DKFZP434G162 protein 3.653 
115023 AA252079 Hs.63931 dachshund (Drosophila) homolog 3.625 
114932 AA242751 Hs.16218 KIAA0903 protein 3.62 
106865 AA487228 Hs.19479 ESTs 3.614 
134480 AA024664 Hs.83916 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex; 5 (13kD; B13) 3.613
124780 R42493 Hs520839 ESTs 3.6 
130631 AA025399 Hs.169737 ESTs 3.592 
134154 AA211320 Hs.79404 neuron-specific protein 3.568
104160 AA455706 Hs.99722 ESTs; Weakly similar to 78 KD GLUCOSE REGULATED PROTEIN 

PRECURSOR 3559 

105524 AA258158 Hs52153 ESTs; Weakly similar to KIAA0352 [H.sapiens] 3342

110168 H19673 Hs.176586 ESTs 3.525 

109480 AA233299 Hs.72158 ESTs 3.522 

109585 F02367 Hs57252 ESTs 3.5 

115134 AA257107 Hs.194331 ESTs 3.5 

116083 AA455653 Hs.44581 ESTs; Weakly similar to HEAT SHOCK 70 KD PROTEIN 6 [H.sapiens] 3.459 

120524 AA261852 Hs.192905 ESTs 3.45 

116932 H74330 Hs.150000 ESTs 3.425 

130746 AA256976 Hs.18800 ESTs; Weakly similar to KIAA0579 protein (H.sapiens) 3.42 

107513 X05451 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 3.417 

118641 N70298 Hs.49829 ESTs 3.407 

126584 AI028384 Hs.127331 ESTs 3.399 

105134 AA159953 Hs52895 ESTs; Weakly similar to arylsulfatase B precursor [H.sapiens] 3.325

123502 AA600116 Hs.112526 ESTs 3.318 

132389 N50866 Hs.47135 ESTs 3.317 

105691 AA287097 Hs.75356 transcription factor 4 3.315 

131505 H85897 Hs57755 ESTs 3.309 

120775 AA342104 Hs.96777 EST 3.3 

105579 AA278824 Hs.19218 ESTs 3595 

128190 AA946876 Hs.148376 ESTs 3592 

100819 HG4020-HT4290 Transglutaminase 3588 

130217 D29956 Hs.152818 ubiquitin specific protease 8 3273

130068 AA608903 Hs.106220 KIAA0336 gene product 3569 

134719 L07515 Hs.89232 chromobox homolog 5 (Drosophila HP1 alpha) 3566 

110277 H29209 Hs.151231 ESTs; Highly similar to FYVE finger-containing phosphoinositide kinase [M.musculus] 356

127354 AM18880 Hs.185797 ESTs 3512

129173 R60523 Hs.109087 ESTs 3.197 

127464 AA970504 Hs.146103 ESTs 3.179 

124923 R94500 Hs.108046 ESTs 3.175

122465 AA448164 Hs.99153 ESTs; Highly similar to CGI-73 protein [H.sapiens] 3.151

122027 AA431302 Hs.98721 EST; Weakly similar to N-copine [H.sapiens] 3.151

103329 X85134 Hs.72984 retinoblastoma-binding protein 5 3.15

129937 M95767 Hs.135578 chitobiase; di-N-acetyl- 3.15 

134197 AA057341 Hs.87889 helicase-moi 3.15 

107764 AA018219 Hs526923 ESTs 3.125 

121775 AA421773 Hs.161008 ESTs 3.125 

114768 AA149007 Hs.182339 Ets homologous factor 3.12 

132381 N48818 Hs.46884 ESTs 3.11 

123105 AA485973 Hs.143947 ESTs 3.104 

121176 AA400080 Hs.97774 ESTs 3.1 

125053 T80620 Hs.186473 ESTs 3.075 
105909 AA401739 Hs.5111 ESTs 3.066 
119767 W72562 Hs.58119 ESTs 3.057

115776 AA424038 Hs5B197 ESTs 3.056 

111713 R22988 Hs.220950 ESTs 3.05 

115301 AA280047 Hs.43948 ESTs 3.05 

118448 N66412 Hs.49189 ESTs 3

106586 AA456598 Hs.256269 ESTs 2.995 

110415 H48239 Hs.29739 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-3A [H.sapiens] 2.979

105173 AA182030 Hs.8364 ESTs 2578 

101102 L07594 Hs.79059 transforming growth factor, beta receptor III (betaglycan; 300kD) 2.976

110543 H58383 Hs.258544 ESTs 2.976

125593 R24464 Hs.202949 KIAA1102 protein 2.964

100824 HG4058-HT4328 Oncogene Aml1-Evi-1, Fusion Activated 2.957

106822 AA481068 Hs.31835 ESTs 2.95 

131963 D11930 Hs.3592 ESTs 255 

111221 N68869 Hs.15119 ESTs 2.936

113620 T93795 Hs.17252 EST 2.917 

105220 AA210695 Hs.17212 ESTs 2.917 

123234 AA490227 Hs.105252 ESTs 2.904 

125250 W87465 Hs.222926 ESTs; Weakly similar to 020922 [C.elegans] 2.9

116196 AA465160 Hs.63386 ESTs 25

122100 AA432243 Hs.41086 ESTs; Weakly similar to OXYSTEROL-BINDING PROTEIN [H.sapiens] 2.896

111712 R22905 Hs.113716 ESTs 2595

126589 W78107 Hs.187698 ESTs; Weakly similar to Yert40wp [S.cerevisiae] 2.895

111132 N64378 Hs.13149 ESTs; Highly similar to unknown function [H.sapiens] 2.894 

115307 AA280300 Hs.191346 ESTs 2.886

108989 AA152263 Hs.18827 KIAA0849 protein 2.883 

129486 H03686 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2.879 

119805 W73788 Hs.43213 ESTs 2.875 

125721 R59881 Hs.7503 ESTs 2.871 

103704 AA028171 Hs.153688 ESTs 2568

128420 AI088155 Hs.14146 ESTs; Weakly similar to unknown [H.sapiens] 2.866

120571 AA280738 Hs.128679 ESTs 2.863 

123059 AA482019 Hs.238202 EST 2.86 

129462 D84239 Hs.111732 IgG Fc binding protein 2.856

125166 W45491 Hs.172609 nucleobindin 1 2.854

125992 W01626 za36e07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 2552

109431 AA227972 Hs.43635 ESTs 2.85 

105077 AA142919 Hs.5558 ESTs 2.847 

131388 R34531 Hs.92200 KIAA0480 gene product 2546 

121080 AA398720 Hs.177953 ESTs 2.838

112575 R73816 Hs.17385 ESTs 2.836 

130244 R26206 Hs.153293 KIAA0701 protein 2525

134698 AA427783 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 2516

116355 AA504356 Hs.88650 ESTs 2513 

115316 AA280627 Hs.57846 ESTs 2.806

129677 U48736 Hs.198891 serine/threonine-protein kinase PRP4 homolog 25

130971 H20332 Hs.28707 signal sequence receptor, gamma (translocon-associated protein gamma) 2.799

115054 AA252863 Hs57729 ESTs 2.795

130285 AA063546 Hs.102968 ESTs 2.792

124308 H93575 Hs.227146 Homo sapiens mRNA; cDNA DKFZp564J142 (from clone DKFZp564J142) 2.783

125502 AA732329 Hs.191959 ESTs 2.778 

114800 AA159825 Hs.131887 ESTs; Weakly similar to ORF YNL227c [S.cerevisiae] 2.768

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapiens] 2.766

130159 H51098 Hs.151310 PDZ domain protein (Drosophila inaD-like) 2.75

107127 AA620504 Hs.22119 ESTs 2.742

113547 T90746 Hs.15233 ESTs 2.734 

104639 AA004622 Hs.18214 ESTs 2727 

127609 AA622559 Hs.150318 ESTs 2.726 

106922 AA490964 Hs.10056 ESTs 2.725 

124825 R52088 yg85c3.s1 Soares infant brain 1NIB Homo sapiens cDNA clone 2.725

124333 H98683 Hs.154054 ESTs 2.708 

117634 N36421 Hs.107854 ESTs; Weakly similar to SODIUM- AND CHLORIDE-DEPENDENT GLYCINE TRANSP 2706

101609 M54927 Hs.1787 proteolipid protein 1 (Pelizaeus-Merzbacher disease; spastic paraplegia 2; uncomplicated) 2.704

117142 H96908 Hs.42251 ESTs 2.7 

112602 R79147 Hs.203365 ESTs 2.695 

106828 AA481505 Hs.13797 ESTs 2.68 

124377 N25996 Hs.179833 ESTs 2575 
101026 J04970 carboxypeptidase M 2.675

124560 N66393 Hs.102754 ESTs 2.675

124066 H02494 Hs.101615 ESTs 2.671

130281 R12777 Hs.15395 ESTs; Weakly similar to ARGINYL-TRNA SYNTHETASE [H.sapiens] 2.66

110949 N49602 Hs.13308 ESTs 2.65 

111031 N54839 Hs.221085 ESTs; Highly similar to mediator [H.sapiens] 2.633

121770 AA421714 Hs.11469 KIAA0896 protein 2.63 

134132 U32519 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2.626

112424 R62452 Hs.191265 ESTs 2.625 

122544 AA451679 Hs.194410 ESTs 2.625 

134425 X90568 Hs.172004 titin 2.624 

111114 N63391 Hs.9238 ESTs 2.619 

116119 AA459242 Hs.44445 ESTs; Weakly similar to Kelch motif containing protein [H.sapiens] 2.615

112079 R44164 Hs^3014 ESTs 2.6 

123033 AA481271 Hs.193945 ESTs 2.591 

124196 H52617 Hs.144167 ESTs 2386 

125873 H14437 yf25a04.fi Soares breast 3NbHBst Homo sapiens cDNA clone 238

117684 N40184 Hs.45050 ESTs 2.575 

134938 D30037 Hs.168326 phosphatidylinositol transfer protein; beta 2375

131822 AA215647 Hs.200332 ESTs 2368 

135185 U71203 Hs.96038 Ric (Drosophila)-like; expressed in many tissues 2364 

117690 N40467 Hs.93834 ESTs 2357 

118807 N78582 Hs.50732 protein kinase; AMP-activated; beta 2 non-catalytic subunit 2.552 

121369 AA405657 Hs.128791 Human DNA sequence from clone 967N21 on chromosome 20p12.3-13. Contains 2.55

114860 AA235112 Hs.106227 ESTs; Moderately similar to similar to murine RNA-binding protein [H.sapiens] 2349

121857 AA426017 Hs.62694 ESTs; Highly similar to DNA-REPAIR PROTEIN COMPLEMENTING 2.548

110190 H20560 Hs.244624 ESTs 2.548 

132573 AA045333 Hs31743 ESTs; Weakly similar to !! ALU SUBFAMILY SB2 WARNING ENTRY !! [H.sapiens] 2.542

109706 F09729 Hs.12780 ESTs 2337

135109 AA410391 Hs.94592 klotho 2325 

132810 R37027 Hs3737 KIAA0475 gene product 2325 

124879 R73588 Hs.101533 ESTs 2325 

103840 AA174190 Hs30932 ESTs 2325 

119066 R22196 Hs.34492 ESTs 2319 

114833 AA234362 Hs37310 ESTs; Moderately similar to CGI-66 protein [H.sapiens] 2307

112998 T23555 Hs.103288 ESTs 23 

123312 AA496258 Hs.99601 ESTs 2.499 

121873 AA426270 Hs.145696 splicing factor (CCU) 2.491

123321 AA496884 Hs.23972 ESTs 2.491 

107760 AA018042 Hs.95078 EST 2.483 

102580 U60808 Hs.152981 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 2.481

103053 X56741 Hs3947 mel transforming oncogene (derived from cell line NK14)- RAB8 homolog 2.475 

124756 R38100 Hs.106294 ESTs 2.475 

112936 T15665 Hs.6185 ESTs; Weakly similar to BcDNA.GH12174 [D.melanogaster] 2.475

125178 W58202 Hs.125731 ESTs 2.475 

112423 R62447 H<*22123 ESTs 2.471 

123515 AA600323 Hs.112535 EST 2.462

102842 U95020 Hs.21903 calcium channel; voltage-dependent; beta 4 subunit 2.457

102400 U42390 Hs.171957 triple functional domain (PTPRF interacting) 2.455 

113187 T56056 Hs.9992 ESTs 2.452 

131687 L11066 Hs.3069 heat shock 70kD protein 9B (mortalin-2) 2.448

115314 AA280583 Hs.256501 ESTs 2.437 

128211 AI206427 Hs.166707 ESTs; Highly similar to Ran-binding protein 2 [H.sapiens] 2.43 

134281 L11005 Hs.81047 aldehyde oxidase 1 2.425 

115985 AA447709 Hs.132094 ESTs; Moderately similar to putative transcription factor CA150 [H.sapiens] 2.425

111348 N90041 Hs.9585 ESTs 2.418

129430 AA258842 Hs.197877 Homo sapiens clone 23777 putative transmembrane GTPase mRNA; partial cds 2.418

133863 C13990 Hs.76930 synuclein; alpha (non A4 component of amyloid precursor) 2.417

111164 N66857 Hs.14808 ESTs; Weakly similar to !! ALU CLASS C WARNING ENTRY !! [H.sapiens] 2.416

132143 AA257056 Hs.7972 KIAA0871 protein 2.412 

130330 M55047 Hs.154679 synaptotagmin 1 2.408 

114219 Z39451 Hs.27389 ESTs 2.406 

117101 H94043 Hs.24341 DKFZP586I1419 protein 2.403

125433 AA034325 Hs34320 ESTs 2.4 

111099 N62506 Hs.21958 ESTs 2.4 

120323 AA195405 Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 2.397

118624 N69998 Hs.21801 ESTs 2.394 

123570 AA608955 Hs.109653 ESTs 2.389 

123562 AA608893 Hs.190065 ESTs 2.388 
131546 AA262821 Hs.26578 muscleblind (Drosophila)-like 2.385

103143 X66141 Hs.75535 myosin; light polypeptide 2; regulatory; cardiac; slow 2.384

123645 AA609310 Hs.188691 ESTs 2.383 

130123 AA001835 Hs.150390 zinc finger protein 262 2.379 

131682 AA428368 Hs.30654 ESTs 2.378 

115909 AA436666 Hs.59761 ESTs 2.375

125168 W45574 Hs252497 ESTs 2.372 

123973 C14805 Hs.182151 ESTs 2.361 

135197 U76456 Homo sapiens tissue inhibitor of metalloproteinase 4 mRNA, complete cds 2.357

118689 N71545 Hs.184544 ESTs 2.357 

107734 AA016225 Hs.93386 ESTs 2.354 

124590 N69220 Hs.41381 ESTs; Weakly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 2.35

111163 N66850 Hs.17606 ESTs 2.348

112349 R58877 Hs22665 ESTs; Moderately similar to dJ83L6.1 [H.sapiens] 2.345

129076 AA262179 Hs.169343 ESTs 2.345

134238 R81509 Hs.184571 splicing factor; arginine/serine-rich 11 2.341

116766 H13260 Hs.95097 ESTs 2.336 

106331 AA436853 Hs.34795 ESTs 2.333 

129003 AA443752 Hs.10784 ESTs 2.332 

132368 AA599814 Hs.46637 ESTs; Weakly similar to cDNA EST yk289g5.5 comes from this gene [C.elegans] 2.332

124697 R06273 Hs.186467 ESTs; Modly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.322

120273 AA176688 Hs.221139 ESTs 2.313

127110 AA304993 Hs.100861 ESTs; Weakly similar to p60 katanin [H.sapiens] 2.307

105450 AA252621 Hs.93842 ESTs 2.301 

119819 W74371 Hs.58383 ESTs 2297 

102302 U33052 Hs.69171 protein kinase C-like 2 2.288

130596 N74353 Hs.16475 ESTs 2282 

114161 Z38904 Hs22385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 2278 

130542 U64675 Human sperm membrane protein BS-63 mRNA, complete cds 2277 

104491 N71513 Hs.39328 ESTs 2275 

116988 H82527 ys69e12.s1 Soares retina N2b4HR Homo sapiens cDNA clone 2275

126823 AA370120 Hs.7870 ESTs; Weakly similar to Ylr350wp [S.cerevisiae] 2273 

108800 AA129731 Hs.90424 ESTs 2273 

101310 L41607 Hs.934 glucosaminyl (N-acetyl) transferase 2; I-branching enzyme 2269

126842 W19498 Hs21085 ESTs 2.255 

127251 AA936428 Hs.128638 ESTs 2251 

124647 N91947 Hs.125033 ESTs 2249 

127112 AI143906 Hs.125103 ESTs 2247 

101973 S82597 Hs.80120 UDP-N-acetyl-alpha-D-galactosamine:polypeptide 2246

120999 AA398302 Hs.127437 ESTs 2245

130225 AA599583 Hs.15299 HMBA-inducible 2243

119980 W88678 Hs249247 heterogeneous nuclear protein similar to rat helix destabilizing protein 2.243

124222 H61053 Hs222844 ESTs 224 

129199 H90914 Hs.128629 ESTs 2236 

106802 AA479101 Hs.16570 ESTs; Weakly similar to I! ALU SUBFAMILY SQ WARNING ENTRY I! [Rsapiens] 2231 

126160 N90960 Hs247277 ESTs; Weakly similar to transformation-related protein [Rsapiens] 2229 

104627 AA001976 Hs.19603 ESTs 2228 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cONA DKF2jp564C053 (from clone DKF2p564C053) 2226 

113096 T40927 Hs*345 ESTs 2225 

135336 AA452822 Hs.99027 ESTs 2225 

135344 R62976 Hs.168491 ESTs; Moderately simitar to TRFHnteracting ankyrin-related 2225 

126156 AA508354 Hs.1 18448 ESTs; Moderately similar to AKT3 protein kinase [H.sapiens] 2222 

128885 AA397841 Hs.180141 oofflin 2 (muscle) - 2.218 

107900 AA026385 Hs.176600 ESTs; Moderately similar to !! ALU SUBFAMILY SB2 WARNING 2217 

114481 AA033562 Hs.151572 ESTs 2212 

109292 AA199828 Hs.188662 ESTs 2212 

104257 AF006265 Hs.9222 estrogen receptor-binding fragment-associated gene 9 2209 

132932 T15482 Hs.6093 ESTs 2204 

127392 AA262728 Hs.14896 Homo sapiens clone 24590 mRNA sequence 2204 

104641 AA004652 Hs.18564 ESTs 22 

122529 AA449828 Hs.99229 ESTs 2.195 

124307 H93562 Hs.162395 proline synthetase co-transcribed (bacterial homolog) 2.193

133601 S95936 Hs.75155 transferrin 2.193

119904 W85709 Hs.128927 ESTs; Weakly similar to !! ALU SUBFAMILY SP WARNING ENTRY !! [H.sapiens] 2.192

100348 D64109 Hs.4994 transducer of ERBB2; 2 (TOB2) 2.185

126871 AA351779 Hs.200334 ESTs 2.18

127793 AI298835 Hs.30445 ESTs; Weakly similar to transcription regulator Staf-50 [H.sapiens] 2.178

105149 AA169253 Hs.8958 ESTs 2.177

121367 AA405648 zw39g8.s1 Soares_total_fetus_Nb2HF8_9w H sapiens cDNA clone IMAGE:772478 2.177

111835 R36228 H&25119 ESTs 2.175

133394 R16759 Hs.237225 ribosomal protein S5 pseudogene 1 2.175

123207 AA489697 Hs.145053 ESTs 2.175

129801 F11087 H&239666 ESTs 2.175

103393 X94612 Hs.41749 protein kinase; cGMP-dependent; type II 2.161

132415 AA043223 Hs.4815 nudix (nucleoside diphosphate linked moiety 2.157

106369 AA443828 Hs^5324 ESTs 2.157 

122963 AA478446 Hs.69559 KIAA1096 protein 2.156

133473 M19309 Hs.73980 troponin T1; skeletal; slow 2.155

134257 C06270 Hs.8078 Homo sapiens mRNA; cDNA DKFZp586L081 (from clone DKFZp586L081) 2.155

135156 AA056012 Hs.9552 binder of Arl Two 2.151

104055 AA393755 Hs.117211 ESTs; Highly similar to CGI-62 protein [H.sapiens] 2.15

102313 U33921 HSU33921 Clontech adult lung cDNA library (HL1158a) Homo sapiens cDNA 2.15

109788 F10638 Hs.12432 Homo sapiens clone 24407 mRNA sequence 2.15

103507 Y10032 Hs.159640 serum/glucocorticoid regulated kinase 2.15

116000 AA448710 Hs.41327 ESTs 2.15 

105858 AA399164 Hs.227676 ESTs; Moderately similar to !! ALU SUBFAMILY SQ 2.137

103153 X66534 Hs.75295 guanylate cyclase 1; soluble; alpha 3 2.137

126202 AA652238 Hs.199726 ESTs 2.135 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q31 2.134

104164 AA458770 Hs.27023 KIAA0917 protein 2.132 

108692 AA121270 Hs.82960 ESTs 2.128

122878 AA465341 Hs.99640 ESTs 2.126 

134771 L13939 Hs.89576 adaptor-related protein complex 1; beta 1 subunit 2.125

104298 D31120 Hs.40368 adaptor-related protein complex 1; sigma 2 subunit 2.125

104840 AA039595 Hs.42458 Homo sapiens mRNA; cDNA DKFZp586C1817 (from clone DKFZp586C1817) 2.125

122180 AA435798 Hs.98835 ESTs; Moderately similar to putative ring zinc finger protein 2.125

131012 H01992 Hs.202949 KIAA1102 protein 2.125

134092 H17490 Hs.7905 ESTs; Highly similar to sorting nexin 9 [H.sapiens] 2.123

118617 N69666 Hs.183413 ESTs; Modtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.123

107155 AA621202 Hs.7946 DKFZP586D1519 protein 2.12

130925 N71935 Hs.169378 multiple PDZ domain protein 2.12

135167 U63717 Hs.95821 osteoclast stimulating factor 1 2.118

105952 AA405263 Hs.181400 ESTs 2.109 

110308 H38148 Hs.32775 ESTs 2.108

116368 AA521186 Hs.94217 ESTs 2.107 

132939 U76189 Hs.61152 exostoses (multiple)-like 2 2.102

117881 N50073 Hs.84926 ESTs; Highly similar to B-IND1 protein [M.musculus] 2.1

121723 AA419622 Hs.104800 ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 2.096

103500 Y09443 Hs.22580 alkylglycerone phosphate synthase 2.094

121429 AA406293 Hs.193498 ESTs 2.093 

134632 AA398710 Hs.174139 chloride channel 3 2.091 

129785 F10980 Hs.184780 ESTs 2.09 

111065 N58193 Hs.18740 ESTs; Weakly similar to 1 -evidence 2.089 

114710 AA129931 Hs.79081 protein phosphatase 1; catalytic subunit; gamma isoform 2.083

132711 N73702 Hs.238927 ESTs 2.083

133377 R05490 Hs.7239 SEC24 (S. cerevisiae) related gene family; member B 2.079 

124773 R40923 Hs.106604 ESTs 2.078 

117759 N47587 Hs.97345 ESTs; Weakly similar to TROPOMODULIN [H.sapiens] 2.076

127386 AI457411 Hs.106728 ESTs 2.076

101167 L15309 Hs.193677 zinc finger protein 141 (clone pHZ-44) 2.075

109597 F02582 Hs.14474 ESTs 2.074

124390 N29325 Hs.7535 ESTs; Highly similar to COBW-like placental protein [H.sapiens] 2.07

116225 AA478609 Hs.47278 Human Chromosome 16 BAC clone CIT987SK-A-735G6 2.07

131243 R16667 Hs.24752 spectrin SH3 domain binding protein 1 2.069

130557 T90830 Hs.15981 ESTs; Weakly similar to Ene-1 protein ORF2 [H.sapiens] 2.067

134103 D14826 Hs.155924 cAMP responsive element modulator 2.064

108833 AA131866 Hs.61661 ESTs; Weakly similar to DY3.6 [C.elegans] 2.063

112286 R53765 Hs.158135 KIAA0981 protein 2.063

125624 AA165411 2q49a01 j1 Stratagene hNT neuron (#937233) Homo sapiens cDNA clone 2.061

124612 N72200 Hs.13913 ESTs 2.058 

116335 AA495830 Hs.87013 ESTs 2.057 

112248 R51361 Hs.23423 ESTs 2.056 

115789 AA424754 Hs.43149 ESTs 2.056 

107029 AA599219 Hs.187492 ESTs; Weakly similar to ALR [H.sapiens] 2.056

110294 H30270 Hs.165062 ESTs 2.054

120532 AA262354 Hs.186648 ESTs 2.054

118180 N59249 Hs.48349 ESTs 2.052 

132018 AA293194 Hs.3737 ESTs 2.052 
132617 AA171913 Hs.5338 carbonic anhydrase XII 2.05

131526 N36167 Hs.28274 ESTs 2.05

113254 T64438 Hs.11449 DKFZP564O123 protein 2.05

122785 AA459978 Hs.99503 ESTs 2.05

107203 D20426 Hs^656 EST 2.05

105713 AA291321 Hs.184319 ESTs; Moderately similar to KIAA1006 protein [H.sapiens] 2.046

129385 D82675 Hs.110950 Homo sapiens clone 25007 mRNA sequence 2.042

119116 R43645 Hs.64595 DKFZP566E2346 protein 2.04

116405 AA600253 Hs£5601 ESTs; Highly similar to host cell factor 2 [H.sapiens] 2.04

125924 AA526849 Hs.82109 syndecan 1 2.039

105599 AA279442 Hs.143460 protein kinase C; nu 2.037

119741 W70205 Hs.43670 kinesin family member 3A 2.037

101449 M21494 Hs.118843 creatine kinase; muscle 2.036

107109 AA609943 Hs.32793 ESTs 2.034

117040 H89112 yw25e5.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE2532B 2.034

132906 AA142857 Hs^34896 ESTs; Highly similar to geminin [H.sapiens] 2.031

105479 AA255546 Hs.23467 ESTs 2.027

102031 U04898 Hs.2156 RAR-related orphan receptor A 2.027

119846 W80363 Hs.58446 ESTs 2.024

124809 R46482 Hs.106875 ESTs 2.024

130286 AA041548 Hs.154023 KIAA0573 protein 2.023

124457 N50114 Hs.128704 ESTs 2.017

125144 W37999 Hs£4336 ESTs 2.017

120581 AA281257 Hs.125868 ESTs 2.014

104931 AA062731 Hs.108319 thyroid hormone receptor-associated protein; 150 kDa subunit 2.012

120548 AA278846 Hs.187634 ESTs 2.011

113933 W81362 Hs.30567 ESTs 2.011

123072 AA485041 Hs.104308 ESTs 2.009

123648 AA609323 Hs.112689 ESTs 2.008

116875 H67749 Hs.161022 EST 2.003

103179 X69398 Hs.82685 CD47 antigen (Rh-related antigen; integrin-associated signal transducer) 1.995

103478 Y07755 Hs.38991 S100 calcium-binding protein A2 1.995

111007 N53378 Hs.22543 ESTs 1.995

120470 AA251797 zs11f3.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 1.989

112280 R53457 Hs.26040 ESTs; Weakly similar to fatty acid omega-hydroxylase [H.sapiens] 1.989

114127 Z38652 Hs.106961 ESTs; Weakly similar to TYL [H.sapiens] 1.988

129863 AA151005 Hs.129872 sperm surface protein 1.988

106320 AA436608 ESTs 1.988

108933 AA147224 Hs.71814 ESTs 1.986

105906 AA401633 Hs.22380 ESTs 1.982

109029 AA157911 Hs.72200 ESTs 1.982

118470 N66769 Hs.82781 ESTs 1.975

115358 AA281886 Hs.88923 ESTs 1.975

115257 AA279060 Hs.193516 B-cell CLL/lymphoma 10 1.974

126879 AA719776 2h38g04.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:414390 1.974

109547 F01479 Hs.26966 ESTs 1.973

127111 AA805726 Hs.220509 ESTs 1.969

101266 L36645 Hs.73964 EphA4 1.966

129319 AA037467 Hs.30340 ESTs 1.965

106211 AA428240 Hs.126083 ESTs 1.962

112753 R93696 Hs.169882 ESTs 1.961

120489 AA255538 Hs.190504 ESTs 1.959

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiquitin-protein ligase Rsp5 1.956

105425 AA251129 Hs.24416 ESTs 1.953

134740 L37362 Hs.89455 opioid receptor; kappa 1 1.95

109324 AA210700 Hs.86405 Homo sapiens mRNA; cDNA DKFZp564P056 (from clone DKFZp564P056) 1.95

124303 H93043 Hs.107070 ESTs 1.95

102337 U36922 Human fork head domain protein (FKHR) mRNA, 3' end 1.948

109441 AA228100 Hs.86998 nuclear factor of activated T-cells 5 1.946

127364 AA179573 Hs.90061 progesterone binding protein 1.942

105255 AA227498 Hs.3623 ESTs 1.942

130672 L19783 Hs.177 phosphatidylinositol glycan; class H 1.942

104301 D45332 Hs.6783 ESTs 1.94

132442 R62589 Hs.167419 ESTs 1.939

105519 AA256063 Hs.23438 ESTs 1.937

132902 AA490969 Hs.168147 ESTs 1.936

118873 N89881 Hs.44577 ESTs 1.936

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein [H.sapiens] 1.934

115075 AA255486 Hs.88045 ESTs 1.933
110695 H93463 Hs.124777 ESTs 1.931

105360 AA236209 Hs.187626 ESTs 1.931 

124998 T56013 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 1.929

121816 AA424814 Hs.187509 ESTs 1.927 

111717 R23241 Hs.110776 STAT Induced STAT inhibitor-2 1.925 

128874 H06245 Hs.106801 ESTs 1.925 

109391 AA219699 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 1.913

126129 H82165 Hs.40334 ESTs 1.911

115553 AA369027 Hs.71414 ESTs 1.905

113811 W44928 Hs.4878 ESTs 1.905

108345 AA070906 zm66d1.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone 1.904

120472 AA251875 Hs.104472 ESTs; Weakly similar to Gag-Pol polyprotein [M.musculus] 1.903

116602 D80063 Hs.241673 EST 1.901 

121121 AA399371 Hs.189095 ESTs; Weakly similar to zinc finger protein SALL1 [H.sapiens] 1.9 

125330 AA401804 Hs.114574 ESTs 1.896 

130095 F01831 Hs.14838 ESTs 1.894

119782 W72982 Hs.38262 ESTs 1.894

104115 AA428090 H^26102 ESTs 1.893

131313 C17938 Hs.22370 Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122) 1.891

105583 AA278907 Hs.24549 ESTs 1.891 

122825 AA461195 Hs.99580 ESTs 1.887 

119495 W35390 Hs.55533 ESTs 1.886

130309 AA134289 Hs.15423 Homo sapiens BAC clone RG114B19 from 7q31.1 1.886

125628 AA418069 Hs.241493 natural killer-tumor recognition sequence 1.886

110611 H66947 Hs.14671 ESTs; Highly similar to gene ERCC5 protein [H.sapiens] 1.885

117301 N22569 Hs.43215 ESTs 1.884

131406 N92239 Hs.26471 Wnt inhibitory factor-1 1.881 

126428 AA013312 Hs.64988 ESTs 1.881 

120285 AA182882 Hs.111110 titin-cap (telethonin) 1.878

112724 R91753 Hs.17757 ESTs 1.878

103121 X63679 Hs.4147 translocating chain-associating membrane protein 1.875

124381 N26765 Hs.109008 ESTs 1.875

117226 N20468 Hs.177322 ESTs; Weakly similar to putative p150 [H.sapiens] 1.875

105610 AA279991 Hs.124691 ESTs; Weakly similar to trithorax homologue 2 [H.sapiens] 1.875

111229 N69113 Hs.110855 ESTs 1.875

120627 AA285079 Hs.190474 ESTs 1.873

107048 AA600012 Hs.10669 ESTs; Moderately similar to KIAA0400 [H.sapiens] 1.872

104041 AA381902 Hs.197114 RNA binding protein 1.872

115162 AA258366 Hs.227606 ras GTPase activating protein 1.872

102239 U26726 Hs.1376 hydroxysteroid (11-beta) dehydrogenase 2 1.87

100043 M10098 AFFX control: 18S ribosomal RNA 1.868

120296 AA191353 Hs^2385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 1.867

129011 S72869 Hs.107932 DNA segment; single copy; probe pH4 (transforming sequence; thyroid-1; 1.867

134851 R44479 Hs.90232 KIAA0552 gene product 1.866

117392 N26175 Hs.93405 ESTs 1.864

114530 AA053027 Hs.191797 ESTs 1.863 

123541 AA608794 Hs.112592 ESTs 1.863

124890 R78618 Hs.34145 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-8 [H.sapiens] 1.862

105299 AA233511 Hs.194720 ATP-binding cassette; sub-family G (WHITE); member 2 1.861

103560 Z20656 Hs.182787 myosin; heavy polypept 6; cardiac muscle; alpha (cardiomyopathy; hypertrophic 1) 1.861

113073 T33637 Hs.6841 ESTs 1.86

120407 AA235040 Hs.107283 ESTs 1.859

103892 AA243523 Hs.17155 ESTs 1.858

123795 AA620381 Hs.70488 ESTs 1.857

108524 AA084323 Hs.68138 ESTs 1.857 

113953 W85812 Hs.187554 ESTs 1.856 

110721 H97678 Hs.31319 ESTs 1.856 

129426 AA412087 Hs.168272 EST; Highly smlr to prot inhibitor of activated STAT prot PIASx-alpha [H.sapiens] 1.853

112102 R44840 Hs.21303 ESTs 1.852 

118502 N67317 Hs.50150 ESTs 1.852 

107619 AA004955 Hs.30015 ESTs 1.851

100436 D87446 Hs.75912 KIAA0257 protein 1.85

120652 AA287312 Hs.191648 ESTs 1.85

121643 AA417078 Hs.193767 ESTs 1.843

117387 N26011 Hs.33810 ESTs 1.843

132084 Y12394 Hs.3886 karyopherin alpha 3 (importin alpha 4) 1.843

124449 N48593 Hs.121820 ESTs 1.841 

120263 AA173440 Hs.193919 ESTs 1.838 

127226 AA731036 Hs.3463 ribosomal protein S23 1.838

111837 R36447 Hs.24453 ESTs 1.835

128727 M64174 Hs.50651 Janus kinase 1 (a protein tyrosine kinase) 1.834 

114439 AA018937 Hs.128629 ESTs 1.833 

102332 U35637 Human nebulin mRNA, partial cds 1.83 

126579 W72979 Hs.146082 ESTs 1.83 

102341 U37122 Hs.6110 adducin 3 (gamma) 1.83

114246 Z39848 Hs.12079 ESTs 1.828

131757 D17532 Hs.316 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 6 (RNA helicase; 54kD) 1.823

108904 AA136521 Hs.71148 ESTs; Weakly similar to putative p150 [H.sapiens] 1.823

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 1.823

131957 AA609008 Hs.183232 ESTs 1.822

100131 D12465 Hs.11951 phosphodiesterase I/nucleotide pyrophosphatase 1 (homologous to mouse Ly-41 antigen) 1.822

124163 H30539 Hs.189838 ESTs 1.821 

118204 N59859 Hs.48443 ESTs 1.821 

107727 AA016021 Hs.173091 DKFZP434K151 protein 1.82

100357 D78156 Hs.241548 RAS p21 protein activator 2 1.82

116295 AA489016 Hs.91216 ESTs; Highly similar to partial CDS; human putative tumor suppressor [H.sapiens] 1.82

124833 R54112 Hs.128697 ESTs 1.817 

122587 AA453255 Hs.6968 ESTs 1.817 

114359 Z41589 Hs.153483 ESTs; Moderately similar to H1 chloride channel [H.sapiens] 1.815 

111289 N72253 Hs.238246 ESTs 1.813 

110826 N30068 Hs.15347 ESTs 1.812 

104106 AA422123 Hs.42457 ESTs 1.811 

130043 AA055404 Hs.193953 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1 253

115864 AA432080 Hs.31200 ESTs 1.81

129737 AA056140 Hs.122684 ESTs 1.81 

124477 N53158 Hs.102682 ESTs 1.809 

100782 HG3740-HT4010 Basic Transcription Factor 2, 34 Kda Subunit 1.806 

106101 AA421053 Hs.34395 ESTs 1.806 

115479 AA287596 zs52h09.s1 NCI_CGAP_GCB1 H sapiens cDNA clone IMAGE:701153 1.804

116104 AA456635 Hs.78524 ESTs 1.804 

114173 Z39050 Hs.21963 ESTs 1.804 

132632 N59764 Hs.5398 guanine-monophosphate synthetase 1.803 

119135 R49548 Hs.169681 death effector domain-containing 1.802

131559 N91087 Hs^8728 ESTs; Weakly similar to F55A12.9 [C.elegans] 1.801

126922 AA177138 Hs.161671 ESTs 1.8

117375 N25427 Hs.108812 ESTs 1.8

103571 Z25535 Hs.211608 nucleoporin 153kD 1.8

105978 AA406367 Hs.15973 ESTs 1.8

125904 H22372 Hs.163586 ESTs 1.799 

133883 AA397915 Hs.77221 choline kinase 1.798 

105777 AA348412 Hs^3096 ESTs 1.797 

110166 H19480 Hs.174309 ESTs 1.796 

105038 AA130273 Hs.7584 ESTs; Weakly similar to hypothetical protein; similar to [H.sapiens] 1.796

105427 AA251330 Hs^8248 ESTs 1.795

115278 AA279757 Hs.67466 ESTs; Weakly similar to BACN32G11.d [D.melanogaster] 1.794

133104 L13698 Hs.65029 growth arrest-specific 1 1.794

131170 N48674 H&23796 Human DNA sequence from clone 1052M9 on chromosome Xq25. Contains the 1.792

100136 D13540 H&22868 protein tyrosine phosphatase; non-receptor type 11 1.791

127263 AA331157 EST35035 Embryo, 6 week, subtracted (total cDNA) I Homo sapiens cDNA 1.79

114157 Z38878 Hs.24979 ESTs 1.79

125601 AI096717 Hs.247043 KIAA0525 protein 1.788

118472 N66818 Hs.42179 ESTs 1.787 

112456 R63925 Hs^8464 ESTs 1.787 

130236 N69682 Hs.31957 SC35-interacting protein 1 1.786

133297 AA600057 Hs.70266 KIAA0905 protein 1.784 

125650 R40096 Hs.176578 ESTs 1.784 

132056 T89386 Hs.38176 KIAA0606 protein; SCN Circadian Oscillatory Protein (SCOP) 1.783 

129093 AA262710 Hs.108614 KIAA0627 protein 1.783 

123176 AA489020 Hs.193424 ESTs 1.782 

106340 AA441792 Hs.22857 chord domain-containing protein 1 1.781 

100598 HG2463-HT2559 Guanine Nucleotide-Binding Protein G25k 1.779

104038 AA374532 EST86676 KSC172 cells I Homo sapiens cDNA 5' end, mRNA sequence 1.778

122235 AA436475 Hs.190104 ESTs 1.777

105104 AA151771 Hs.76941 ATPase; Na+/K+ transporting; beta 3 polypeptide 1.776

107601 AA004636 Hs.50223 ESTs 1.776

131467 W68255 H&27194 DKFZP434K171 protein 1.776

118449 N66413 Hs.172466 ESTs; Weakly similar to KIAA0775 protein [H.sapiens] 1.776
107969 AA034030 Hs.155212 methylmalonyl Coenzyme A mutase 1.775

115527 AA342079 Hs.252055 ESTs 1.775

132471 T16305 Hs.49349 beta-site APP-cleaving enzyme 1.775

105966 AA406105 Hs.5344 adaptor-related protein complex 1; gamma 1 subunit 1.774

127548 AA373091 Hs.93832 Homo sapiens clone 24483 unknown mRNA; partial cds 1.774

106217 AA428379 Hs^4870 ESTs 1.773 

131214 N26777 Hs.172635 ESTs 1.773 

106295 AA435664 Hs.8583 similar to APOBEC1 1.773 

106328 AA436705 Hs£8020 KIAA0766 gene product 1.772

124661 N93797 Hs.3090 EphB1 1.772

122988 AA479166 Hs.105633 ESTs 1.772

115504 AA291946 Hs.42736 ESTs 1.771

105168 AA180208 Hs.16606 ESTs; Highly similar to CGI-32 protein [H.sapiens] 1.767

129153 AA188618 Hs.181461 ariadne; Drosophila; homolog of 1.766 

105829 AA398290 Hs^1965 ESTs 1.764 

101811 M86917 Hs^4734 oxysterol binding protein 1.764 

100138 D13628 Hs£463 angiopoietin 1 1.764 

124704 R07335 ye96d.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 1.763

122314 AA442257 Hs.192076 ESTs 1.762

109865 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.761

106206 AA428069 Hs.89519 KIAA1046 protein 1.758

107135 AA620782 Hs.23247 ESTs 1.757

105760 AA338960 Hs.28170 ESTs 1.756

106288 AA435536 Hs.24336 ESTs 1.756 

103968 AA304566 Hs.3542 ESTs 1.756 

129559 AA234945 Hs.11360 ESTs 1.756 

117885 N50112 Hs.47023 ESTs 1.754 

107032 AA599472 Hs.247309 succinate-CoA ligase; GDP-forming; beta subunit 1.754

124807 R45963 Hs.233811 ESTs; Weakly similar to ORF2 [M.musculus] 1.753

100276 D42047 Hs.82432 KIAA0089 protein 1.753

110924 N47938 yy84a09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 1.751

133002 AF006082 Hs.62461 ARP2 (actin-related protein 2; yeast) homolog 1.751 

132530 AA455917 Hs.50785 SEC22; vesicle trafficking protein (S. cerevisiae)-like 1 1.75 

110759 N21671 Hs.19025 ESTs 1.75 

106138 AA424515 Hs.33264 ESTs 1.75 

107348 U43701 Hs.184776 ribosomal protein L23a 1.75

115867 AA432162 Hs.165986 DKFZP586B2022 protein 1.749

135398 AA194075 Hs.99908 nuclear receptor coactivator 4 1.747

113783 W19222 Hs.7041 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.747

134898 X98330 Hs.90821 ryanodine receptor 2 (cardiac) 1.745

132215 T10132 Hs.4236 KIAA0478 gene product 1.744

104229 AB002346 Hs.61289 synaptojanin 2 1.743

116166 AA461556 Hs.202949 KIAA1102 protein 1.743

115433 AA284252 Hs.58372 ESTs 1.743 

114908 AA236545 Hs.54973 ESTs 1.742 

127425 AA470941 Hs.143162 ESTs 1.741 

131089 Z38807 Hs^2870 ESTs 1.739 

113498 T88908 Hs.189746 ESTs 1.738 

116710 F10577 Hs.70312 ESTs 1.735 

127210 R51476 yg76f04.r1 Soares infant brain 1NIB Homo sapiens cDNA clone 1.733

120554 AA279654 Hs.194524 ESTs 1.733

129940 U18242 Hs.13572 calcium modulating ligand 1.732

117023 H88157 Hs.41105 ESTs 1.731

111700 R22212 Hs.23361 ESTs 1.731

116911 H72240 Hs.39292 ESTs; Moderately similar to KIAA0745 protein [H.sapiens] 1.731

106025 AA412063 Hs£065 ESTs 1.728 

108626 AA101984 Hs.61697 G-protein coupled receptor 1.726 

111614 R12581 Hs.191146 ESTs 1.726 

134134 L76703 Hs.173328 protein phosphatase 2; regulatory subunit B (B56); epsilon isoform 1.725

106886 AA489086 Hs.36545 ESTs 1.725 

117998 N52136 Hs.93828 ESTs 1.725 

121204 AA400422 Hs.55896 ESTs 1.725 

121342 AA404995 Hs.192480 ESTs 1.725 

131129 R27296 Hs.23240 ESTs 1.725 

116235 AA479181 Hs.186726 ESTs 1.725 

102423 U44754 Hs.179312 small nuclear RNA activating complex; polypeptide 1; 43kD 1.724

110273 H29050 Hs.24096 ESTs 1.722 

108758 AA127395 Hs.222414 ESTs 1.722 

110672 H88477 Hs.191178 ESTs 1.721 
120271 AA176404 Hs.111092 ESTs; Weakly similar to ZINC FINGER PROTEIN 136 [H.sapiens] 1.72

100227 D26915 Hs.82316 interferon-induced; hepatitis C-associated microtubular aggregate prot (44kD) 1.719

129232 W69459 Hs.109655 sex comb on midleg (Drosophila)-like 1 1.719

134663 W73367 Hs.8750 ESTs 1.717

104902 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.717

120582 AA281290 Hs.125287 ESTs; Weakly similar to BC331191_1 [H.sapiens] 1.717

134891 F03517 Hs.90787 ESTs 1.716 

106219 AA428567 Hs.26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 1.715 

116372 AA521311 Hs.13854 ESTs 1.713 

107570 AA001870 Hs.237323 N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 1.713

106198 AA427816 Hs.11803 ESTs 1.712

125136 W31479 Hs.129051 ESTs 1.712 

104973 AA085676 Hs.6763 KIAA0942 protein 1.712 

128710 J04813 Hs.104117 cytochrome P450; subfamily IIIA (niphedipine oxidase); polypeptide 5 1.711 

123994 D20899 Hs.107127 Homo sapiens mRNA; cDNA DKFZp564G022 (from clone DKFZp564G022) 1.711

127871 AA766511 Hs.128848 ESTs 1.71 

116089 AA455933 Hs.41324 ESTs 1.709 

123337 AA504153 Hs.132797 ESTs; Weakly similar to ORF YGL050w [S.cerevisiae] 1.708

123619 AA609200 Hs.162686 ESTs 1.708

104781 AA026617 H&21610 ESTs; Highly similar to BAI1-associated protein 1 [H.sapiens] 1.707

115114 AA256468 Hs.88148 ESTs 1.705

117852 N49408 Hs.136102 KIAA0853 protein 1.705

127644 T57570 Hs.77039 ribosomal protein S3A 1.704

111359 N91273 Hs.27179 ESTs 1.702

131721 L36644 Hs.31092 EphA5 1.7

132438 F08925 Hs.48610 ESTs 1.7 
132476 N67192 Hs.49476 Homo sapiens clone TUA8 CrkJu-chat region mRNA 1.7 
130990 F02488 Hs.21917 KIAA0768 protein 1.7 
128499 AA487503 Hs.100636 ESTs 1.698 

120780 AA342337 Hs.241569 ESTs; Modtly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.697

132920 L06133 Hs.606 ATPase; Cu++ transporting; alpha polypeptide (Menkes syndrome) 1.696 

135037 U77948 Hs.184122 general transcription factor II; i 1.696 

110024 H11297 Hs.31050 ESTs 1.695 

134415 AA329274 Hs.82911 protein tyrosine phosphatase type IVA; member 2 1.694 

102223 U24685 Hs.148226 Human anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4) gene; clone E11; VH4-63 non-productive rearrangement 1.694

126712 AA205862 Hs.7942 ESTs 1.694

101507 M27492 Hs.82112 interleukin 1 receptor; type I 1.692

106291 AA435551 Hs.30824 ESTs 1.691

116826 H58691 Hs.8215 ESTs; Weakly similar to double-stranded RNA-binding nuclear protein DRSBP76 [H.sapiens] 1.69

135339 D59269 Hs.127842 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 783648 1.69

118250 N62602 yz75b6.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288851 3' similar to contains Alu repetitive element;, mRNA sequence 1.689

106470 AA450116 Hs.186180 ESTs 1.688

108203 AA057678 Hs.63408 ESTs 1.687

119748 W70313 Hs.126906 ESTs 1.686

116576 D51228 Hs.79404 neuron-specific protein 1.683

123035 AA481392 Hs.105166 ESTs 1.683

126668 AA011616 Hs.184086 ESTs 1.681

101512 M28209 Hs.250716 RAB1; member RAS oncogene family 1.678 

102704 U76638 Hs.54089 BRCA1 associated RING domain 1 1.677 

126218 AA256386 Hs.13649 Novel human gene mapping to chromosome 13; similar to rat RhoGAP 1.676

111180 N67277 Hs.9403 ESTs 1.676

105937 AA404342 Hs.173531 ESTs 1.675

114118 Z38520 Hs.175930 ESTs 1.675

109203 AA190634 Hs.108787 endoplasmic reticulum membrane protein 1.675

125245 W86608 Hs.7243 ubiquitin specific protease 24 1.675

102906 X06956 Hs.75318 tubulin; alpha 1 (testis specific) 1.675

125914 AA262925 Hs.180034 cleavage stimulation factor; 3' pre-RNA; subunit 3; 77kD 1.674

134294 U63289 Hs.81248 CUG triplet repeat; RNA-binding protein 1 1.674

109742 F10108 Hs.183333 ESTs 1.673

134674 D63876 Hs.87726 KIAA0154 protein 1.673

104079 AA402937 Hs.103238 ESTs 1.671

107554 AA001386 Hs.39844 ESTs 1.671

132439 AA243139 Hs.4863 Homo sapiens clone 25088 mRNA sequence 1.669
124515 N58172 Hs.109370 ESTs 1.668
124300 H92575 Hs.105959 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.668
126809 AA743475 Hs.171693 ESTs 1.667 
106095 AA419547 Hs.11713 ESTs 1.664
101754 M77142 Hjl2394S9 TIA1 cytotoxic granule-associated RNA-binding protein 1.653
105188 AA192306 Hs^3926 ESTs 1.663

113582 T91371 Hs.16824 EST 1.661
119559 W38197 Accession not listed in Genbank 1.661

119961 W87535 Hs.59015 ring finger protein 9 1.657
123255 AA490890 Hs.105273 ESTs 1.657

111078 N59230 Hs.186574 ESTs 1.655
113082 T40528 Hs.8246 ESTs 1.654
119589 W44692 Hs.124177 ESTs 1.652
104308 D53639 Hs.77904 ribosomal protein S26 1.65
103073 X59417 Hs.74077 proteasome (prosome; macropain) subunit; alpha type; 6 1.65
124424 N35314 Hs.107265 ESTs 1.65
128890 AA096157 Hs.182364 ESTs; Weakly similar to 25 kDa trypsin inhibitor [H.sapiens] 1.65
119400 T92767 ye27d06.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:118955 3', mRNA sequence. 1.65

131631 AA486868 Hs^9802 slit (Drosophila) homolog 2 1.65

118229 N62339 Hs.180532 heat shock 90kD protein 1; alpha 1.649

118533 N67954 Hs.49413 ESTs 1.648

130666 AA476307 Hs.194035 KIAA0737 gene product 1.647

103093 X60708 Hs.44926 dipeptidylpeptidase IV (CD26; adenosine deaminase complexing protein 2) 1.647

128667 U69140 Hs.103419 fasciculation and elongation protein zeta 2 (zygin II) 1.646

112933 T15530 Hs^2 1439 ESTs 1.646

114546 AA056263 Hs.132747 ESTs 1.645

126705 AA579377 Hs.180532 heat shock 90kD protein 1; alpha 1.644 

114399 AA007595 Hs.220937 ESTs 1.642

118836 N79820 Hs.50854 ESTs 1.64

100401 D85423 Homo sapiens mRNA for Cdc5, partial cds 1.64

105681 AA264865 Hs.171228 KIAA1040 protein 1.639

132526 AA460128 Hs.5074 similar to S.pombe dim1+ 1.639

133809 AA034002 Hs.76359 catalase 1.639

115968 AA447083 Hs.134522 ESTs 1.637

116370 AA521256 Hs.236204 ESTs; Moderately similar to NUCLEAR PORE COMPLEX PROTEIN NUP107 [R.norvegicus] 1.631

109644 F04477 Hs.204802 ESTs; Moderately similar to GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE; LIVER [H.sapiens] 1.627

103427 X97303 H.sapiens mRNA for Ptg-12 protein 1.627

132186 T33888 Hs£2104O KIAA1038 protein 1.626

131428 U17838 Hs£6719 PR domain containing 2; with ZNF domain 1.626

126638 AA649257 Hs.188602 ESTs 1.625

114503 AA039568 Hs.188033 ESTs 1.625

121242 AA400857 Hs.97509 EST 1.625

122414 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.625

110632 H72344 Hs.171635 ESTs 1.624 

111389 N95837 Hs.169111 ESTs; Weakly similar to L82A [D.melanogaster] 1.624

112449 R63802 Hs.124186 ring finger protein 2 1.623

113070 T33464 Hs.6298 ESTs 1.622 

107229 D59284 Hs.34644 ESTs 1.618 

132710 W93726 Hs.55279 protease inhibitor 5 (maspin) 1.617 

124664 N94814 Hs.33540 ESTs; Weakly similar to KIAA0765 protein [H.sapiens] 1.617

130166 AA350690 Hs.151411 KIAA0916 protein 1.616

125040 T78451 Hs.199961 ESTs 1.615

132972 H39627 Hs.164967 ESTs; Weakly similar to !! ALU SUBFAMILY SB WARNING ENTRY !! [H.sapiens] 1.615

115873 AA433916 Hs.90093 heat shock 70kD protein 4 1.611

120408 AA235045 Hs.190151 ESTs 1.61

120934 AA383773 Hs.191500 ESTs 1.61 

115259 AA279071 Hs.13453 splicing factor 3b; subunit 1; 155kD 1.609 

134330 D20113 Hs.8185 ESTs; Highly similar to CGM4 protein [H.sapiens] 1.607 

115117 AA256492 Hs.49007 poly(A) polymerase 1.606

125162 W44682 Hs.109896 ESTs 1.605

103946 AA285246 Hs.111650 ESTs; Weakly similar to Prt1 homolog [H.sapiens] 1.604

133389 AA166917 Hs.72639 ESTs 1.603

115528 AA342301 Hs.53929 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens] 1.602

129704 W81301 Hs.12064 ubiquitin specific protease 22 1.602

109313 AA206800 Hs.86276 ESTs; Moderately similar to zinc finger protein dp [H.sapiens] 1.601

130457 U58091 Hs.155976 cullin 4B 1.6

123076 AA485211 Hs.190046 ESTs 1.6

115113 AA256460 Hs.44610 ESTs 1.6

117731 N46433 Hs.46609 ESTs 1.6 
123344 AA504338 Hs.171857 ESTs 1.599

131798 X86093 Hs.3238 adenovirus 5 E1A binding protein 1.597

125370 AA256743 Hs.151791 KIAA0092 gene product 1.596

114918 AA236813 Hs.72324 ESTs; Highly similar to unknown [H.sapiens] 1.596

114807 AA160805 Hs.199832 ESTs 1.596

105103 AA151593 Hs.10130 ESTs 1.594

125004 T60120 yb68!Q2.s1 Stratagene ovary (#937217) Homo sapiens cDNA clone IMAGE:76347 3', mRNA sequence. 1.592

105658 AA282914 Hs.10176 ESTs 1.589 

110455 H52172 yt85e8.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:23111 3' similar to contains Alu repetitive element;, mRNA sequence 1.589

119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.587

126983 AA211537 zn55d01.r1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:562081 5', mRNA sequence. 1.586

134675 AA250745 Hs.87773 protein kinase; cAMP-dependent; catalytic; beta 1.584

105431 AA252033 Hs.15036 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.584

120187 Z40251 Hs.56974 ESTs 1.584

115830 AA428137 Hs.86434 ESTs 1.581

135069 AA456311 Hs.93961 ESTs; Weakly similar to !! ALU CLASS A WARNING ENTRY !! [H.sapiens] 1.581

122997 AA479295 Hs.106290 Kelch motif containing protein 1.581

119707 W67569 Hs.44143 ESTs; Weakly similar to SNF2alpha protein [H.sapiens] 1.58

131934 D80948 Hs.34922 ESTs 1.58

106141 AA424558 Hs.9302 phosducin-like 1.58

115271 AA279422 Hs.5724 ESTs 1.579

131468 R27598 Hs.27197 KIAA0797 protein 1.577

131165 R98173 Hs.23763 Max-interacting protein 1.575

117273 N21680 Hs.43047 ESTs 1.575

101569 M33772 Hs.182421 troponin C2; fast 1.575

116127 AA459703 Hs.79070 v-myc avian myelocytomatosis viral oncogene homolog 1.575

120022 W90625 Hs.58432 ESTs 1.575

117512 N32157 Hs.82207 ESTs 1.574

106511 AA452855 Hs.206713 UDP-Gal:betaGlcNAc beta 1;4- galactosyltransferase; polypeptide 2 1.573

116415 AA609204 Hs.27973 KIAA0874 protein 1.573

127879 AAS10215 Hs.189079 ESTs 1.571

125211 W72798 Hs.103177 ESTs; Wkly smlr to cDNA EST EMBL:D32579 comes from this gene [C.elegans] 1.571

114746 AA135638 Hs.223756 ESTs 1.571 

122698 AA456112 Hs.99410 ESTs 1.57 

116765 H12636 Hs.121585 ESTs; Weakty similar to reverse transcriptase (Ksapiens) 1.568 

130895 AA609828 Hs.21015 ESTs; Highly similar to tetracycline transporter-like protein [Mjnuscutus] 1568 

40 114338 241366 Hs.40109 KIAA0872 protein 1567 

111005 N53076 Hs.5996 ESTs 1567 

128135 AA913491 Hs.189143 ESTs; Modrtly smlr to II ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1567 

112046 R43365 Hs.22273 ESTs 1.566 

132160 AA281770 Hs.184081 seven in absentia (Drosophfla) homolog 1 1566 

45 111568 R10153 Hs.20561 ESTs 1566 

127775 H04106 Hs.179902 ESTs; Weakly similar to NG22 [H. sapiens] 1.566 

115359 AA281936 Hs.88914 ESTs 1.566 

121845 AA425734 Hs.165066 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.565 

127854 AA769520 ESTs; WeaWy similar to REGULATOR OF MITOTIC SPINDLE 

50 ASSEMBLY 1 (H^apiens] 1564 . 

120287 AA187679 Hs.111114 ESTs 1.563 

114940 AA243012 Hs.75928 ESTs 1.562 

126716 AA031700 Hs.251962 ESTs 1.562 

134161 U97188 Hs.79440 IGF-II mRNA-binding protein 3 1.561 

125390 H95094 Hs.75187 translocase of outer mitochondrial membrane 20 (yeast) homolog 1.561 

115334 AA281244 Hs.65300 ESTs 1.559 

113721 T97931 Hs.18190 EST 1.558 

114895 AA236177 Hs.76591 KIAA0887 protein 1.558 

119341 T62571 Hs.146388 microtubule-associated protein 7 1.558 

108012 AA039616 Hs.61933 ESTs 1.558 

130335 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.557 

134351 R82074 Hs.82109 syndecan 1 1.557 

133300 D51401 Hs.70333 ESTs 1.553 

106920 AA490899 Hs.24462 ESTs 1.553 

118744 N74075 Hs.94293 EST 1.552 

126489 W20016 Hs.144228 ESTs; Weakly similar to ZINC FINGER PROTEIN 83 [H.sapiens] 1.55 

115913 AA436720 Hs.65487 ESTs 1.55 

107868 AA025234 Hs.61260 ESTs 1.55 

134520 N21407 Hs^57325 ESTs 1.55 

109703 F096B4 Hs.24792 ESTs; Weakly similar to ORF YOR283w [S.cerevisiae] 1.55 

120288 AA187938 Hs.55189 ESTs; Weakly similar to F2SB5.3 [C.elegans] 1.548 

106356 AA443277 Hs.31034 peroxisomal biogenesis factor 11A 1.548 

129460 AA235627 Hs.11171 APG5 (autophagy 5; S. cerevisiae)-like 1.547 

133950 011961 Hs.77823 ESTs 1.546 

128172 AI400862 Hs.142607 ESTs 1.546 

114162 Z38909 Hs.22265 ESTs 1.545 

101803 M86546 Hs.155691 pre-B-cell leukemia transcription factor 1 1.544 

113617 T93630 Hs.17207 ESTs 1.542 

104896 AA054228 H«l23165 ESTs 1.541 

114477 AA032013 Hs.144260 EST 1.54 

110731 K98653 Hs.188006 KIAA0878 protein 1.54 

130367 Z38501 Hs.8768 ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.538 

130539 L07044 Hs^50857 Homo sapiens calriurrtfcalmalulir^^ 1.538 

134921 W60186 Hs.169487 Kreisler (mouse) maf-related leucine zipper homolog 1.537 

130583 W24957 Hs.16281 ESTs; Moderately similar to similar to C.elegans protein 

encoded in cosmid T20D3 [H.sapiens] 1.537 

133723 AA088851 Hs.75744 S-adenosylmethionine decarboxylase 1 1.537 

106450 AA449469 Hs.11859 ESTs 1.536 

104120 AA429838 Hs.69519 KIAA1046 protein 1.536 

100533 HG1879-HT1919 Ras-Like Protein Tc10 1.535 

130664 R09049 Hs.17625 ESTs 1.535 

127122 AA279153 Hs.190049 ESTs 1.535 

134264 T03391 Hs.8087 ESTs 1.535 

132319 AA418662 Hs.44625 ESTs 1.535 

115465 AA286941 Hs.43691 ESTs 1.533 

125003 T59442 Hs.100445 ESTs 1.532 

102273 U30888 Hs.75981 ubiquitin specific protease 14 (tRNA-guanine transglycosylase) 1.532 

121675 AA426299 Hs.98510 ESTs 1.532 

114366 Z41747 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.531 

132944 AA054515 Hs.6127 ESTs; Weakly similar to prostate-specific transglutaminase [H.sapiens] 1.53 

111199 N68210 Hs.29822 ESTs 1.53 

113494 T88878 Hs.258738 ESTs 1.529 

129515 AA490882 Hs.112227 ESTs 1.528 

133124 AA156049 Hs.65490 ESTs 1.528 

104785 AA027163 Hs.7942 ESTs 1.526 

105595 AA279408 Hs.25866 ESTs 1.526 

130198 U67156 Hs.151988 mitogen-activated protein kinase kinase kinase 5 1.526 

114297 Z40758 Hs.173091 DKFZP434K151 protein 1.525 

112876 T03488 Hs.4842 ESTs 1.525 

127500 AA525014 Hs.162115 ESTs 1.525 

120519 AA258585 Hs.129887 cadherin 19 (NOTE: redefinition of symbol) 1.525 

119859 W80702 Hs.58461 ESTs 1.525 

129944 L00389 Hs.1361 cytochrome P450; subfamily I (aromatic compound-inducible); polypeptide 2 1.524 

118864 N89670 Hs.42148 ESTs; Weakly similar to Su(P) [D.melanogaster] 1.523 

123964 C13961 Hs.210115 EST 1.523 

111676 R19414 Hs.166459 ESTs 1.522 

128332 AI079523 Hs.134173 ESTs 1.522 

130455 X17059 Hs.155956 N-acetyltransferase 1 (arylamine N-acetyltransferase) 1.521 

125181 W58461 Hs.12396 ESTs 1.521 

127093 AA768241 oa72d02.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 

IMAGE:1317795 3', mRNA sequence. 1.521 

132156 AA157401 Hs.4113 S-adenosylhomocysteine hydrolase-like 1 1.521 

125303 Z39821 Hs.107295 ESTs 1.52 

132697 AA281951 Hs.5518 Homo sapiens mRNA; cDNA DKFZp566J2146 (from clone DKFZp566J2146) 1.52 

117086 H93135 Hs.41840 ESTs 1.519 

113355 T79203 Hs.14480 ESTs 1.518 

108621 AA101811 Hs.69506 ESTs 1.518 

109384 AA219172 Hs.86849 EST 1.518 

128510 X94703 Hs.100816 RAB28; member RAS oncogene family 1.517 

132968 N77151 Hs.61638 myosin X 1.515 

117035 H88798 Hs.41182 ESTs 1.515 

116781 H22985 Hs.52132 ESTs 1.513 

108677 AA115629 Hs.118531 ESTs 1.513 

130214 H780O3 Hs.15266 ESTs 1.513 

134700 AA481414 Hs.5668 golgi SNAP receptor complex member 1 1.512 

116618 D80783 Hs.45224 ESTs 1.508 

126257 N99638 tumor necrosis factor receptor superfamily; member 10b 1.508 

125859 AA606808 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.508 

113837 W57698 Hs.8888 ESTs 1.507 

114317 Z41038 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.507 

100311 D50640 Hs.184653 phosphodiesterase 3B; cGMP-inhibited 1.507 

126802 AA947601 Hs.97058 ESTs 1.506 

128661 R82837 Hs.103329 KIAA0970 protein 1.506 

134194 AA233231 Hs.79828 ESTs 1.506 

108953 AA149652 Hs.42128 ESTs 1.504 

133240 D31161 Hs.68613 ESTs 1.502 

132671 X763Q2 Hs.54649 putative nucleic acid binding protein RY-1 1.501 

132609 Z48923 Hs.53250 bone morphogenetic protein receptor; type II (serine/threonine kinase) 1.501 

105574 AA278678 Hs.258567 ESTs 1.5 

113718 T97782 Hs.256268 ESTs 1.5 

127824 AI208365 Hs.127811 ESTs 1.5 

130132 U55936 Hs.184376 synaptosomal-associated protein; 23kD 1.5 

127394 AA453224 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.5 

100485 HG1111-HT1111 Ras-Like Protein Tc21 1.5 

101078 L04510 Hs.792 ADP-ribosylation factor domain protein 1; 64kD 1.5 

128611 AA456845 Hs.102471 KIAA0680 gene product 1.5 

TABLE 12A shows the accession numbers for those primekeys lacking UnigeneIDs in 
Table 12. For each probeset, we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank 
accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 
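
The three columns defined above amount to a lookup from probeset to gene cluster to member sequences. The following is a minimal sketch in Python, assuming each table row is whitespace-delimited with the Pkey first, the CAT number second, and the accession numbers after that. The names `ClusterEntry`, `parse_row` and `build_index` are hypothetical and introduced only for illustration; they are not part of the analysis described in this application.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterEntry:
    """One row of a Pkey / CAT number / Accession listing."""
    pkey: int                      # Eos probeset identifier
    cat_number: str                # gene cluster number, kept as text (e.g. "10492.1")
    accessions: list = field(default_factory=list)  # Genbank accession numbers


def parse_row(line):
    """Split one whitespace-delimited row into a ClusterEntry (assumed layout)."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("row needs at least a Pkey and a CAT number")
    return ClusterEntry(int(tokens[0]), tokens[1], tokens[2:])


def build_index(lines):
    """Map each Pkey to its ClusterEntry, skipping blank lines."""
    return {entry.pkey: entry
            for entry in (parse_row(l) for l in lines if l.strip())}


# Two rows abbreviated from the listing below; the accession lists are truncated.
rows = [
    "125873 10492.1 AW271838 AL133605 C01646",
    "",
    "125954 4457.1 AB023584 W44753",
]
index = build_index(rows)
print(index[125873].cat_number, index[125873].accessions)
```

Keeping the CAT number as a string avoids losing the sub-cluster suffix to floating-point formatting.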



Pkey CAT number Accession 



108536 119811.1 
117040 46956.1 
100782 18457.1 



100819 3022.1 



100824 5.36 



125004 264197 J 
102313 27608.1 
102337 553.1 



124704 



124825 
110455 
126257 
125624 
104038 
103427 



292319J 
185904.1 
330773 1 
46874 T 
182217J 
154135 1 
264235 1 
43892.1 



104142 113242 J 
127093 47721.1 



AA084524 AA339253 AW966289 

AW970600 AA503323 H89218 AF086031 H891 12 

AA355435 NM.001516 230093 T28405 AW949486 AA461142 AA410532 AI652073 AA521208 AI970141 AI968234 At026102 
AA713583 AW135876 AA936614 AA770300 A1242635 AA377033 AW960263 AW607683 AI273603 AA4 10287 AI040513 
AA460838 AI80391 6 AW294095 AW449680 AW798677 AW675048 BE5421 1 6 AL120521 

L34840 NM.003241 U31905 AI546931 AI791616 AI973065 AI792321 AI546937 AI685880 Ai732835 A1682360 AA420653 
AA564047 AI682323 AI824614 AI659889 AI680052 A1970887 AI623108 AA420692 AI418074 AA631018 AI810595 AW291463 
AW449930 AI668908 AI970818 

AI393237 AI521317 AI761348 AF025841 D43968 AW994987 L34598 AF025841 D89789 D89788 D89790 AW998932 
AI971742 AI310238 X90976 AW139668 AW674280 A1365552 AA877452 AV657554 C75229 AA376077 AI798056 AW609213 
W25586 H30149 BE075089 BE075190 AW5B0858 H99598 AA425238 AA133916 AW363478 BE158121 BE158127 
AW467960 BE158135 BE158126 BE158145 N92660 AA847246 AI961688 AI361423 AA878154 M043767 AB63712 
A1559226 AW339007 AI371266 A1368901 AA046624 AA1 34739 AW4491 54 AA1 30232 A1458720 AA96251 1 AI700627 
R70437 AW0O4O08 AA045229 AI671572 H99599 AA043768 AI685454 AI871685 N29937 X90977 AA524240 AI1421 14 
AI825750 AI567805 AI631365 AI347893 AA134740 F20669 M046707 AW793216 AW963298 AW959380 AA363265 
AI784593 AI268201 R69451 AV657618 AI695588 

BE312163 AJ230798 AA374482 AI926059 AA622653 AI860704 BE139185 AW296884 T60238 T60120 
U33921 AI190489AA573311 

Af814663 AA806761 AA765241 AA019317 AA092255 AA035405 T85079 AA890151 AI373959 T85080 BE153728 AA740848 
BE080682 AL048137 AW182316 AI699468 AW274481 AW407538 AA306562 AW950024 AW949943 AL045703 AW843196 
W25132 BE612794 AA304266 AW958054 H25673 AV646563 AV646573 BE172990 AW593488 AA385181 M164998 
AK46476 AA345406 A1277554 AA134749 AA856624 BE613247 AA299003 AL048138 M028121 T92510 AJ923835 
AW020440AI401594AI889401 N93290 AA044247 AA028100 AI582845AA811151 AI741811 AB25878 AA448277 AA172221 . 
AI214783 BE220793 AA022746 AI082882 AA022849 AI928385 AA573472 AI420686 AW072902 AI799493 AI873506 
AJ468977 AI192079 AI468976 AA044272 AW015701 AW316979 AA933042 AA609017 AJ31B393 A1424571 AI934945 
AA172023 AW050917 AA846180 AA134748 AI003947 AI766769 AW006697 AA653517 AW575680 AI474214 AA401478 
U36922 AA927064 AA868000 D82654 T91745 AW500202 AA194764 M746346 AA130464 AW1 17498 AA054526 N26432 
H02534 H04964 AW303367 BE300931 AI218049 A120 8073 AW 182749 AA983630 A! 147585 AA1 94765 AA054534 AA922720 
AI436585 AI346535 AA134269 AA280923 AA897422 AA019559 AW274010 AA035406 AA917879 H99327 W32908 Ai2 16046 
AW496823 AA019414 H82288 W35284 AI936621 AI7671 13 AA866177 AW367874 H82398 AF032885 AW300151 AW467069 
AA809346 AI188507 AI494178 AA872752 AI631631 U02310 NM 002015 AA815006 AI382453 AW197658 A1761654 
AI804396 AI382221 Af813640 Af439635 AI523901 AW5f 7242 AI221705 AW298104 AW204560 AW573095 AW028783 
AW014650 AI766744 AI808294 AI698758 AI041809 AI766667 AI479103 AA872797 AA769305 AA765080 AA334166 
A1472322 
R07335R07640 

AW953679 AW953680 AA244436 H82527 AA361046 AA244483 H82526 
AA501669 R52088 
H52576 AF085971 H52172 

N99638 AW973750 AA328271 H90994 AA558020 AA234435 N59599 R94815 

AW968363 AA465492 R34539 AA165411 

AA374532AA421255 

BE514383 AA071273 AW247987 AW673286 BE312102 AW749824 BE071985 AW577383 BE071945 BE072005 AW577355 
BE071965 AW239231 BE072000 BE071960 AW577360 AW749830 AW373020 X97303 AW999522 BE000192 BE562219 
BE266655BE264970 
AA074713AA447006 

AW977549 AA256038 AL365415 AW500455 AA768241 AW968097 217849 AA256104 
125992 
127210 



127263 
135197 

127394 
126879 
126983 
120470 
127854 
121367 
106320 



115479 
101026 



232161.1 
29440.1 

304844J 

1860.2 

171841 1 

188975.1 

443883.1 

280429.1 

6435.1 



201515.1 
11075J 



125873 10492.1 AW271838 AL133605 C01646 H29959 AA999896 060676 AW999454 AW961 176 AA315244 H 14437 AW3861 18 N46512 

AW272021 AI768516 BE466421 A1082809 AI804454 AA905101 AW173368 N38942 AW614169 AI080483 N29489 AI500550 
AA994475 AA614464 AA707368 AA593145 AA569473 AW627815 AI828244 N63226 N42300 
125954 4457.1 NM.016353 AB023584 W44753 R09585 AA382855 R23772 A1814257 AA974046 AK001608 AI935638 AW440609 AI420022 

AA777386 AA806969 A1554876 A1584006 AI688556 AI688634 AI697997 AI0 14540 AI806683 AI741202 AW263154 
AW297238 AI149951 AI589076 AW082158 AW6 14265 AA931 887 AA781 969 R09490AA484643 AI2071 21 AI088390 
A1538065 AI619547 A1741 925 AJ702846 H40846 R93943 AW747979 AA461348 U30163 AA326023 A535992 AW242870 
AI244025 A1222558 W38425 AW473630 AI624599 AI921226 AI683152 AI096458 AI123822 AW170802 C16447 AB37674 
D25726 AW339366 AW771259 AA461 174 
1589048 1 H48372W01626 
15307.6 AA305278AA223833 

110924 6443.1 AW058463 AF195766 AA680145 T86901 W60373 W60281 NM.007222AF1 06862 AI000795AA1 67 188 
AW884503 AW891313 AW891332 AW891312 AI984924 AI123518 N75170 AA131614 H25330 AI913358 A1742277 W25576 
R58771 AW445159 AW888628 AW888627 AW274674 AI088482 N52314 N34282 AW001769 AI338943 T66784 AI2889S3 
AW468676 AW237528 H25289 N71690 AA610128 AI143458 AI082599 N49144 AA854773 AW66341 1 AW610151 N47938 
AW601626 AA167189 AA918304 AA805205 BE069496 AA652836 BE069499 AI699298 AW249926 AW888578 BE567635 
T10726 AW604715 D54245 D53062 D55610 D55555 AA301376 AI133498 N77788 AI936320 AW090734 AI269977 N50828 
AA550814 AI421993 AI005384 N50813 D60292 D59349 AA131710 D81698 D81699 
AA331 156 AA331 157 AA331 155 

U76456 NM.003256 AF057532 AA193414 AW293304 AW963378 AA313095 AI359841 AI969312 AI080163 AW448926 
AI671136 BE466399 AI637967 AI671873 AW196583 AW071635 AI634427 AW296872 AW292470 AA193650 
BE161832 AA453224 AA485772 
D90391 M55575 A165226B AA719776 
AA524886 AW971347 AA211537 
AW971327 AA524988 AW628653 AA251797 
AW976796AA769520 
AA432071 AA405648 AW000908 T16347 

AB028957 AL120001 AI267678 H10928 R19844 AW970334 AA393182 F05472 F1 171 1 H09908 N50250 AI81541 1 BE463679 
D51468 AW970253 D60889 C15548 D61011 D60867 A1815795 AA534831 D81386 AW235039 AI382158 D81174 AA416899 
AA852310 H09789 H10929 H09813 F09369 R44721 D51515 Z38456 R14004 T66255 F12148 F12139 AW351702 M85350 
AI018713 AW972450 AW972645 AA514964 T66172 F09785 F09776 AA436608 T05327 T071 18 AA339352 
AW301608 N46706 AA649093 AA287595 AW81 1753 AA287596 N39260 

NM.001874 JO4970 T91426 AW205201 T84979 AA255727 AAB47837 R02164 T91339 AV651884 AV651835 AV651350 
AV6501 18 AV651338 A1272002 A1367796 AA830651 AA2621 12 AW151 198 
100401 24827.1 AU076696 AA219720 AL135197 AA305877 N56376 AA318063 AA130725 AW954903 BE541230 AW383312 U86753 085423 

AI679458 AI122932 AB007892 A1583919 BE160134 F08104 R34903 F13440 AA095444 AA262453 AA191036 R17895 
T81266 BE149776 AI279537 AI1431 13 AA361072 AW959030 AW268817 AA81 1533 BE275179 AI221677 T65147 R49293 
AA249176 BE000290 AA768053 F09494 BE092645 BE1 72099 241 177 AA044750 AI90976B BE140795 BE140574 AW845210 
AW752452 BE243244 AA843664 AI300080 BE169032 AW189979 BE004869 AA621872 AI951772 AI678897 AI926598 
N62813 AI350912 AW608791 AI309602 AI983138 AW875592 AI655073 AW875626 AA130606 AI370827 C75528 C75554 
AW263335 AI344426 BE0O4788 AA576220 AA604824 AI431405 AA749378 R38882 AW955075 AA173821 C75657 
AA219672 AW768408 R43141 AI431414 AA483343 A1673792 T17294 AW770187 N74285 AI476404 AI088288 AA654152 
AW974864 BE61731 1 BE243328 BE168049 
130542 28089.3 U64675 AW167507 AW167508 BE218568 AA779360 W85722 AL044843 BE159404 AF012086 AW893611 AW898610 

BE159405 BE092191 AW890826 AW369841 AW368064 AW606702 AL044731 R82691 AA419346 AA416558 H96045 
AL040450 AI640531 AI808434AL046613 AW855784 AW362469 AL048881 AL049015 AA094272 AA888908 AA4 17294 
AW237786 R59793 AL044916 D82402 AK16854 AI079342 H96406 AL037845 AI915900 AA972133 AI478783 T31074 
221135221396 AA352182R13918AA430178C17811 AI371824 AI742256 AA926801 N79156 AA350610AA081971 N83639 
R35544 AA312292 AW952080 N42322 AA1 71957 AA565297 R89207 AA504106 AI630782 AA826482 AI301579 T36241 
AW956618 228426 AL043480AI124636 AA393449T19504 AW887823AI289814 N53979 AL043571 A1632764 AI859613 
A1986308 AI683212 AI984499 A1133258 C05898 AW512761 AI041260 BE466240 219161 AI351 190 N67549 AJ373374 
AA400873 AW440914 AW514879 AA770146 AI358754 R51 1 13 AI283773 AA649886 T30543 D54358 R37750 T03358 
T15451 T15880 AA999689 N67396 AI056289 TB5597 N62441 R89099 RO0O35 T85596 R61335 R00128 N63359 A1535964 
100485 30576.2 AI207768 M31468 NM.012250 W0 1322 AA253280 AA253233 AA293148 AW582106 R79880 AA459547 AA363459 

AA234396 N31669 H44468 AA434587 AW363088 AW993541 
112277.6 AA070906AA070934 

19669.1 X51501 NM.002652 Y10179 J03460 AI791618 AI821473 AA916588 AA564296 AA9161 10 AI972286 A1420470 AJ568790 

AI597724 AW205207 AI659305 AI781620 AA532383 AI821475 AA526498 

32905.1 NM.012249 M31470 AL043108 AA262561 AA1 78883 T29433 AA313329 W48807 AW404323 AA453560 AW403227 H94816 
W17101 AA165152 W23989 AA091310 

23902.2 AL121734 D54896 AA424269 BE242906 AA362118 BE018454 AI280348 AL048769 M35543 AA757734 AI128865 H20289 
H23728 AI203445 H41481 H18237 H44081 H92839 AI928621 H75675 D51 148 AI796198 AW390453 D55579 D54145 053996 
D54015 R37664 H17541 AA668681 T65061 R15867 AW468123 R16049 H69030 AA054226 H16070 F09655 R92144 T03521 
R05473 H92840 AA01 8186 R91707 

14745.3 U35637 AA112989 219308 
genbank_N62602 N62602 
entrez_Z84483 Z84483 
genbank_T92767 T92767 
entrez_W38197 W38197 



108345 
100522 

100533 

100598 



102332 
118250 
103678 
119400 
119559 

TABLE 13 shows genes, including expressed sequence tags, up-regulated in prostate tumor 
tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02 GeneChip 
array. Shown are the ratios of "average" normal prostate to "average" prostate cancer tissues. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Background subtracted normal prostate : prostate tumor tissue 
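
The R1 column is a normal-to-tumor ratio, so values well below 1 correspond to higher signal in tumor tissue. A minimal sketch of that arithmetic in Python follows. It assumes that "average" means the arithmetic mean across samples and that a single background value is subtracted from every signal; neither detail is specified in the text above. The function name `r1_ratio` and the intensities are hypothetical.

```python
from statistics import mean


def r1_ratio(normal_signals, tumor_signals, background=0.0):
    """Return mean(normal - background) / mean(tumor - background).

    Assumption: one scalar background applies to all samples.
    """
    normal_avg = mean(s - background for s in normal_signals)
    tumor_avg = mean(s - background for s in tumor_signals)
    if tumor_avg <= 0:
        raise ValueError("background-subtracted tumor average must be positive")
    return normal_avg / tumor_avg


# Hypothetical intensities for one probeset (not data from Table 13).
example_r1 = r1_ratio([30.0, 50.0], [900.0, 1100.0], background=10.0)
print(round(example_r1, 3))  # prints 0.03
```

With these made-up numbers the normal average is 30 and the tumor average is 990, giving a ratio of about 0.03, the same order of magnitude as the smallest entries in the table.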




Pkey ExAccn UnigeneID Unigene Title R1 


333516 


Cn2Z_rGENcS. 1 73_i 


A AOQ 

U.UZo 


337954 CH22_EM:AC005500.GENSCAN.96-3 0.029 

332496 H73299 Hs.204354 ras homolog gene family; member B 0.03 

337944 CH22_EM:AC005500.GENSCAN.89-7 0.033 

334111 CH22_FGENES.330_10 0.033 

333657 CH22_FGENES.241_2 0.034 

327718 CH.04_hs gi|6525284 0.034 

336355 CH22_FGENES.817_5 0.035 

322011 AL137354 EST cluster (not in UniGene) 0.035 

336377 CH22_FGENES.821_5 0.036 

300254 AW079607 Hs.188417 ESTs; Weakly similar to ZnT-3 [H.sapiens] 0.037 

330096 CH.19_p2gi|6015278 0.037 

335191 CH22_FGENES.507_6 0.038 

334040 CH22_FGENES.322_8 0.039 

333586 CH22_FGENES.204_2 0.04 

333295 CH22_FGENES.132_2 0.042 

313326 AI088120 Hs.122329 ESTs 0.043 

329517 CH.10_p2gi|3983513 0.043 

333403 CH22_FGENES.144_21 0.043 

335226 CH22_FGENES.513J1 0.044 

335976 CH22_FGENES.652J 1 0.045 

333637 CH22_FGENES.229_2 0.046 

334582 CH22_FGENES.407_5 0.046 

336437 CH22_FGENES.826J 0.047 

337461 CH22_FGENES.782-1 0.047 

302892 N58545 Hs.6975 histone deacetylase 3 0.049 

338689 CH22_EM:AC005500.GENSCAN.475-3 0.049 

334721 CH22_FGENES.421_32 0.049 

305867 AA864572 EST singleton (not in UniGene) with exon hit 0.049 

335498 CH22_FGENES.571_7 0.05 


oil 596 Aloo20oo 


hSxZJobo to IS 


vAJO 


326959 CH.21_hsgi|6469836 0.051 

311688 AW025661 Hs.240090 ESTs 0.052 

317298 AI922374 Hs.158549 ESTs 0.052 

332984 CH22_FGENES.54_6 0.052 

321039 AW247083 EST cluster (not in UniGene) 0.053 

335844 CH22_FGENES.623_4 0.053 

325371 CH.12_hsgi|5866920 0.054 

335667 CH22_FGENES.590J8 0.054 

333635 CH22_FGENES.228_2 0.054 

336736 CH22_FGENES.110-2 0.055 

335893 CH22_FGENES.635_1 0.055 

333170 CH22_FGENES.94_5 0.055 

329768 CH.14_p2gi|6015501 0.055 

334030 CH22_FGENES.320_2 0.055 

323359 AA234172 Hs.137418 ESTs 0.055 

300453 AW051431 Hs.113029 ribosomal protein S25 0.055 

334262 CH22_FGENES.367J2 0.055 

306590 AI000246 EST singleton (not in UniGene) with exon hit 0.055 

331087 R22520 Hs.23398 ESTs 0.055 

338620 CH22_EM:AC005500.GENSCAN.450-18 0.056 

339045 CH22_DA59H18.GENSCAN.28-5 0.056 

308023 AI452732 EST singleton (not in UniGene) with exon hit 0.057 

339067 CH22_DA59H18.GENSCAN.33-3 0.057 

335689 CH22_FGENES.596_4 0.057 

339069 CH22_DA59H18.GENSCAN.33-5 0.057 

338176 CH22_EM:AC005500.GENSCAN.219-4 0.057 

328159 CH.06_hsgi|5868065 0.058 

335655 CH22_FGENES.590_6 0.058 

336371 CH22_FGENES.820_1 0.058 

336558 CH22_FGENES.842_3 0.059 

337738 CH22_EM:AC000097.GENSCAN.10O4 0.059 

334273 CH22_FGENES.369_2 0.059 

335889 CH22_FGENES.633_3 0.059 

327807 CH.05_hsgi|5867968 0.059 

333315 CH22_FGENES.138_7 0.059 

338825 CH22_D0246D7.GENSCAN.4-6 0.06 

337612 CH22_C20H12.GENSCAN.22-5 0.06 

333897 CH22_FGENES.293_4 0.06 

335990 CH22_FGENES.655_4 0.06 

334264 CH22_FGENES.367_15 0.06 

338653 CH22_EM:AC005500.GENSCAN.460-39 0.061 

322303 W07459 EST cluster (not in UniGene) 0.061 

333498 CH22_FGENES.168_8 0.061 

336522 CH22_FGENES.839_3 0.061 

301357 AW295677 Hs.137840 ESTs; Moderately similar to HOMEOBOX 

PROTEIN SIX1 [H.sapiens] 0.062 

305917 AA876469 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.062 

336143 CH22_FGENES.705_5 0.063 

333493 CH22_FGENES.168_2 0.063 

332533 M99487 Hs.1915 folate hydrolase (prostate-specific membrane antigen) 1 0.063 

325844 CH.16_hsgi|6552453 0.063 

336402 CH22_FGENES.823_17 0.063 

335767 CH22_FGENES.607_1 0.064 

301893 T80334 EST cluster (not in UniGene) with exon hit 0.064 

324019 AW177009 EST cluster (not in UniGene) 0.064 

305801 AA845997 EST singleton (not in UniGene) with exon hit 0.064 

335188 CH22_FGENES.507_3 0.065 

337533 CH22_FGENES.82*2 0.065 

333311 CH22_FGENES.138_3 0.065 

335668 CH22_FGENES.590_19 0.065 

306786 AI041589 EST singleton (not in UniGene) with exon hit 0.066 

306365 AA962086 EST singleton (not in UniGene) with exon hit 0.066 

306249 AA933840 EST singleton (not in UniGene) with exon hit 0.066 

335018 CH22_FGENES.474_6 0.066 

333594 CH22_FGENES510 3 0.066 

333900 CH22_FGENES.293_7 0.066 

325207 CH.10_hsgi|6552430 0.067 

329888 CH.15_p2gi|6067149 0.067 

326238 CH.17_hsgi|5867260 0.067 

333658 CH22_FGENES.241_4 0.067 

335809 CH22_FGENES.617_6 0.068 

307427 AI243437 EST singleton (not in UniGene) with exon hit 0.068 

318428 AI949409 Hs.224583 ESTs 0.069 

327005 CH.21_hsgi|5867664 0.069 

330463 HG998-HT998 Sulfotransferase, Phenol-Preferring 0.069 

333318 CH22_FGENES.138_10 0.07 

333313 CH22_FGENES.138_5 0.07 

325937 CH.16_hsgi|5867132 0.07 

335663 CH22_FGENES.590_14 0.07 

335349 CH22_FGENES.539_2 0.07 

303396 AA224470 Hs.25426 ESTs; Weakly similar to unknown [H.sapiens] 0.07 

332603 N66681 Hs.33470 ESTs 0.07 

333310 CH22_FGENES.138_2 0.071 

309924 AW340812 EST singleton (not in UniGene) with exon hit 0.071 

336340 CH22_FGENES.814_15 0.071 

308025 AI453365 Hs.172928 collagen; type I; alpha 1 0.071 

306805 AI055966 EST singleton (not in UniGene) with exon hit 0.071 

335499 CH22_FGENES.571_8 0.071 

329669 CH.14_p2 gi|6272t29 0.071 

321666 D28390 EST cluster (not in UniGene) 0.071 

338174 CH22_EM:AC005500.GENSCAN.219-2 0.072 

336558 CH22_FGENES.842_1 0.072 

305451 AA738105 Hs.140 immunoglobulin gamma 3 (Gm marker) 0.072 

336684 CH22_FGENES.46-1 0.072 

326943 CH.21_hsgi|6004446 0.073 

333947 CH22_FGENES.303J 0.074 

333214 CH22_FGENES.104_5 0.074 

331917 AA446572 Hs.174007 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING 0.074 

339102 CH22_DA59H18.GENSCAN.44-9 0.074 

328122 CH.06_hsgi|5868031 0.075 

332250 N62712 Hs.226223 KIAA0618 gene product 0.075 

328506 CH.07_hsgi|5868471 0.075 

331756 AA291468 Hs.98504 ESTs 0.075 

335193 CH22_FGENES.507_8 0.076 

317729 AA971718 Hs.128141 ESTs 0.076 

304515 AA458708 Hs.251577 hemoglobin; alpha 2 0.076 

313644 AI565766 Hs.124960 ESTs 0.076 

326145 CH.17_hsgi|5867204 0.076 

336394 CH22_FGENES.823_6 0.077 

306516 AA989542 EST singleton (not in UniGene) with exon hit 0.077 

300629 AA152119 Hs.155101 ATP synthase; H+ transporting; mitochondrial F1 complex; alpha subunit; 

isoform 1; cardiac muscle 0.077 

333160 CH22_FGENES.91_2 0.077 

337490 CH22_FGENES.799-5 0.077 

305403 AA723748 EST singleton (not in UniGene) with exon hit 0.077 

331747 AA281765 Hs.193689 ESTs 0.077 

332792 CH22_FGENES.3_2 0.078 

330513 M81057 Hs.180884 carboxypeptidase B1 (tissue) 0.078 

308905 AI859636 Hs.8102 ribosomal protein S20 0.078 

337419 CH22_FGENES.759-4 0.078 

333459 CH22_FGENES.157_8 0.078 

334851 CH22_FGENES.440JJ 0.078 

329046 CH.X_hsgi|5868569 0.078 

327879 CH.06_hsgi|5868142 0.079 

305830 AA857665 EST singleton (not in UniGene) with exon hit 0.079 

302928 AL137719 EST cluster (not in UniGene) with exon hit 0.079 

304321 AA136698 Hs.113029 ribosomal protein S25 0.079 

326390 CH.19_hs gi|5867340 0.079 

335230 CH22_FGENES.514_2 0.08 

334622 CH22_FGENES.412_6 0.08 

335331 CH22_FGENES.535_4 0.08 

304753 AA578840 Hs.77961 major histocompatibility complex; class I; B 0.08 

301863 AI418863 EST cluster (not in UniGene) with exon hit 0.081 

336561 CH22_FGENES.842_6 0.081 

335611 CH22_FGENES.583_5 0.081 

305060 AA635771 EST singleton (not in UniGene) with exon hit 0.081 

306051 AA905130 EST singleton (not in UniGene) with exon hit 0.082 

308289 AI571211 EST singleton (not in UniGene) with exon hit 0.082 

334365 CH22_FGENES.378_13 0.082 

335496 CH22_FGENES.571_4 0.082 

332634 S38953 Human unidentified gene complementary to P450c21 

gene; partial cds 0.082 

337824 CH22_EM:AC005500.GENSCAN.13-18 0.082 

335822 CH22_FGENES.619_7 0.082 

334758 CH22_FGENES.428_7 0.082 

309641 AW194230 Hs.253100 EST 0.082 

333064 CH22_FGENES.75_7 0.083 

338695 CH22_EM:AC005500.GENSCAN.477-25 0.083 

331809 AA402482 Hs.97312 ESTs 0.083 

326138 CH.17_hsgi|5867203 0.083 

328304 CH.07_hsgi|6004478 0.083 

330570 U60276 Hs.165439 arsA (bacterial) arsenite transporter; ATP-binding; homolog 1 0.083 

334305 CH22_FGENES.373_8 0.083 

335885 CH22_FGENES.632_3 0.083 

325839 CH.16_hsgi|6552452 0.083 

333531 CH22_FGENES.175_18 0.084 

330385 AA449749 Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 

1 [H.sapiens] 0.084 

323305 AA811351 Hs.25307 Homo sapiens clone 24812 mRNA sequence 0.084 

331698 Z39929 Hs.65843 ESTs 0.084 

335888 CH22_FGENES.633_2 0.084 

306008 AA894390 EST singleton (not in UniGene) with exon hit 0.084 

334249 CH22_FGENES.365J5 0.084 

318303 AW451197 Hs.113418 ESTs 0.084 

330171 CH.0£j)2gi|6648220 0.084 

336662 CH22_FGENES.41-1 0.085 

320506 AI815668 Hs.157476 suc1-associated neurotrophic factor target 2 

(FGFR signalling adaptor) 0.085 

316974 AI740721 Hs.128292 ESTs 0.085 

336492 CH22_FGENES.832J 0.085 

335750 CH22_FGENES.602_4 0.085 

335676 CH22_FGENES.594_1 0.086 

336093 CH22_FGENES.691_2 0.086 

310932 AI933861 Hs.222852 ESTs 0.086 

335160 CH22_FGENES.502_4 0.086 

334306 CH22_FGENES.373_9 0.086 

334793 CH22_FGENES.433_5 0.086 

333936 CH22_FGENES.301_2 0.087 

336413 CH22_FGENES.823_35 0.087 

333775 CH22_FGENES.272J 0.087 

335971 CH22_FGENES.652_4 0.087 

301737 AJ8159B1 EST cluster (not in UniGene) with exon hit 0.087 

339101 CH22_DA59H18.GENSCAN.44-6 0.087 

327612 CH.04_hsgi|6525283 0.087 

326241 CH.17_hsgi|5867260 0.088 

338386 CH22_EM:AC005500.GENSCAN.331-4 0.088 

327762 CH.05_hsgi|5867961 0.088 

305266 AA679772 EST singleton (not in UniGene) with exon hit 0.088 

334359 CH22_FGENES.378_4 0.088 

335500 CH22_FGENES.571J0 0.088 

329687 CH.14_p2gi|6117856 0.088 

333654 CH22_FGENES.240_2 0.088 

324430 AA464018 EST cluster (not in UniGene) 0.088 

325999 CH.16_hsgi|5867073 0.089 

334832 CH22_FGENES.439_1 0.089 

339115 CH22_DA59H18.GENSCAN.49-3 0.089 

300896 AI916902 Hs.213882 ESTs 0.089 

328784 CH.07_hsgi|5868309 0.089 

335044 CH22_FGENES.480J 0.089 

329791 CH.14_p2gi|6469354 0.089 

333656 CH22_FGENES.240J 0.089 

326180 CH.17_hsgi|5867211 0.089 

333391 CH22_FGENES.144_6 0.089 

338324 CH22_EM:AC005500.GENSCAN.306-3 0.089 

305396 AA721052 EST singleton (not in UniGene) with exon hit 0.089 

337483 CH22_FGENES.795-7 0.09 

326424 CH.19_hsgi|5867369 0.09 

306454 AA977992 EST singleton (not in UniGene) with exon hit 0.09 

338893 CH22_DJ32I10.GENSCAN.7-6 0.09 

327470 CH.02_hsgi|5867772 0.09 

333165 CH22_FGENES.91_7 0.09 

307155 AI186738 Hs.182426 ribosomal protein S2 0.09 

330717 AA233926 Hs.23635 ESTs 0.09 

335334 CH22_FGENES.535J0 0.09 

335907 CH22_FGENES.636_2 0.09 

333885 CH22_FGENES.292J 0.09 

331034 N51868 Hs.31965 ESTs; Moderately similar to 40S RIBOSOMAL 

PROTEIN S20 [H.sapiens] 0.09 

304660 AA534416 Hs.162185 ESTs 0.09 

328217 CH.06_hsgi|5868096 0.091 

336068 CH22_FGENES.684_13 0.091 

302833 AA295381 Hs.44423 ESTs 0.091 

328668 CH.07_hsgi|5868254 0.091 

335309 CH22_FGENES.532_2 0.091 

338481 CH22_EM:AC005500.GENSCAN.377-5 0.091 

306286 AA936892 EST singleton (not in UniGene) with exon hit 0.091 

305070 AA639783 EST singleton (not in UniGene) with exon hit 0.091 

304870 AA594811 Hs.119122 ribosomal protein L13a 0.091 

303856 AA968589 Hs.944 glucose phosphate isomerase 0.091 

323789 AI459812 Hs.170460 ESTs; Weakly similar to KIAA0990 protein [H.sapiens] 0.092 

334910 CH22_FGENES.455_3 0.092 

325382 CH.19_hsgi|5867327 0.092 

332467 AA489630 Hs.119004 KIAA0665 gene product 0.092 

338534 CH22_EM:AC005500.GENSCAN.402-7 0.092 

336449 CH22_FGENES.829_6 0.092 

333709 CH22_FGENES.250_24 0.092 

336559 CH22_FGENES.842_4 0.092 

333230 CH22_FGENES.107_10 0.093 

333133 CH22_FGENES.83_9 0.093 

334885 CH22_FGENES.451J1 0.093 

330605 X02419 Hs.77274 plasminogen activator; urokinase 0.093 

336392 CH22_FGENES.823_4 0.093 

334083 CH22_FGENES.327_38 0.093 

325469 CH.12_hsgi|6017034 0.093 

331077 R09531 Hs.19039 ESTs 0.093 

303701 AW500732 EST cluster (not in UniGene) with exon hit 0.093 

334218 CH22_FGENES.358_3 0.093 

336542 CH22_FGENES340_6 0.093 

337151 CH22_FGENES.546-1 0.093 

333642 CH22_FGENES.231_2 0.093 

336863 CH22_FGENES^97-4 0.093 

334680 CH22_FGENES.419_2 0.093 

326365 CH.18_hsgi|5867297 0.093 

338952 CH22_DJ32I10.GENSCAN.23-22 0.093 

337539 CH22_FGENES.832-4 0.094 

333546 CH22_FGENES.180_2 0.094 

335258 CH22_FGENES.518_3 0.094 

336786 CH22_FGENES.168-19 0.094 

321644 AI204177 Hs.237396 ESTs 0.094 

335943 CH22_FGENES.646_17 0.094 

327918 CH.06_hsgi|5868165 0.094 

306398 AA970548 EST singleton (not in UniGene) with exon hit 0.094 

335671 CH22_FGENES.592_3 0.094 

335033 CH22_FGENES.475J1 0.094 

338277 CH22_EM:AC005500.GENSCAN^90-2 0.094 

332061 AA504812 Hs.192824 early B-cell factor 0.094 

305153 AA654582 Hs.77039 ribosomal protein S3A 0.094 

333880 CH22_FGENES.292_2 0.094 

323940 AI864428 Hs.170880 ESTs 0.094 

313779 AA648796 Hs.129771 ESTs 0.095 

323109 AA169345 EST cluster (not in UniGene) 0.095 

332930 CH22_FGENES.38^4 0.095 

335368 CH22_FGENES.543_6 0.095 

303887 R72672 Hs.193484 ESTs; Weakly similar to Similarity with yeast gene 

L3502.1 [C.elegans] 0.095 

336223 CH22_FGENES.727_3 0.095 

311280 AI767957 Hs.197737 ESTs; Weakly similar to Y38A8.1 gene product [C.elegans] 0.095 

337256 CH22_FGENES.648-3 0.095 

308814 AI819263 EST singleton (not in UniGene) with exon hit 0.095 

334659 CH22_FGENES.418_7 0.095 

335895 CH22_FGENES.635_3 0.095 

321697 AW388061 Hs.4953 golgi autoantigen; golgin subfamily a; 3 0.095 

336010 CH22_FGENES.668_8 0.096 

302824 U21260 EST cluster (not in UniGene) with exon hit 0.096 

333612 CH22_FGENES.217_7 0.096 

304823 AA584837 EST singleton (not in UniGene) with exon hit 0.096 

335665 CH22_FGENES.590J6 0.096 

306518 AA989598 EST singleton (not in UniGene) with exon hit 0.096 

60 335243 CH22_FGENES.516_4 0.096 

335436 CH22_FGENES.559_5 0.096 

300243 AI420256 Hs.161271 ESTs 0.096 

332810 CH22^FGENES.7_12 0.097 

308612 AI735634 EST singleton (not in UniGene) with exon hit 0.097 

65 335818 CH22,FGENES.618^6 0.097 

325838 CH.16_hsgi|6552452 0.097 

337482 CH2^FGENES.795-6 0.097 

336645 CH22 FGENES.26-1 0.097 

337293 CH22.FGENES.675-1 0.098 
329893 CH.15_p2gi|6525313 0.098 

326533 CH.19_hsgi|5867441 0.098 

334905 CH22_FGENES.452_20 0.098 
306347 AA961144 EST singleton (not in UniGene) with exon hit 0.098 

5 336676 CH22_FGENES.434 0.098 

339166 CH22 OA59H18.GENSCAN.69-7 0.098 

335774 CH22 FGENES.607J0 0.098 

339216 CH22_FF113D11.GENSCAN.6-11 0.098 

335311 CH22 FGBJES.532_4 0.098 

10 329632 CH.11 _p2 gi|6729060 0.098 

328595 CH.07 hs 01)5868224 0.098 

326928 CH.21_hsgi|6456782 0.098 

315234 AI079680 Hs.120770 ESTs 0.098 

306082 AA908508 EST singleton (not in UniGene) with exon hit 0.098 

15 305710 AA826544 EST singleton (not in UniGene) with exon hit 0.098 

318540 T30280 EST cluster (not in UniGene) 0.099 

337553 CH22_C4G1.GENSCAN.2-1 0.099 

320951 AA344069 Hs.202699 neurexophilin 4 0.099 

303845 T08033 EST cluster (not in UniGene) with exon hit 0.099 

20 338981 CH22_QA59H10.GENSGAN.2-5 0.099 

321313 R87365 Hs.26058 ESTs; Weakly similar to p532 [H.sapiens] 0.099 

328348 CH.07_hsgi|5868383 0.099 

332203 H49388 Hs.102082 EST 0.099 

301780 R07064 EST cluster (not in UniGene) with exon hit 0.099 

25 332095 AA608838 Hs.162681 EST 0.099 

333227 CH22 FGB<ES.107_5 0.099 

316442 AA760894 Hs.153023 ESTs 0.099 

326001 CH.16_hs gi|5867073 0.099 

334363 CH22_FGENES.378_11 0.099 

30 338895 CH22_DJ32U0.GENSCAN.9-2 0.099 

327460 CH.02_hsgi|6004455 0.099 

332705 T59161 Hs.76293 thymosin; beta 10 0.1 

307806 AI351739 EST singleton (not in UniGene) with exon hit 0.1 

322800 F25037 Hs.225175 ESTs 0.1 

35 304918 AA602697 EST singleton (not in UniGene) with exon hit 0.1 

334327 CH22_FGENES.375_4 0.1 

318359 A1097439 Hs.135548 ESTs 0.1 

326644 CH.20 hsgi|5B67559 0.1 

334454 CH22_FGENES.388_3 0.1 

40 327959 CH.06_hsgi|5868210 0.1 

323783 AA330586 Hs.131819 ESTs 0.1 

309198 AI955915 Hs.248038 major histocompatibility complex; class I; C 0.1 

339265 CH22_BA354l12.GENSCAN.10-3 0.1 

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (from clone DKFZp564C122) 0.1 

338132 CH22_Bfl^C005500.GENSCAN^0O-2 0.1 

333163 CH22_FGBIES.91_5 0.101 

337584 CH22_C20H12.GENSCAN.5-1 0.101 

307588 AI285535 EST singleton (not in UniGene) with exon hit 0.101 

50 336969 CH22_FGENES.378-2 0.101 

327535 CH.Q2_hsgi|6525279 0.101 

328732 CH.07_hsgi]5868289 0.101 

336686 CH22_FGENES.46-3 0.101 

335777 CH22LFGENES.607J3 0.101 

55 332944 CH22_FGENES.47_3 0.101 

333174 CH22_FGENES.95_1 0.101 

336380 CH22_FGENES.821_8 0.101 

330571 U60800 Hs.79089 sema domain; immunoglobulin domain (Ig); cytoplasmic domain; (semaphorin) 4D 0.101 

331789 AA398721 Hs.186749 ESTs 0.101 

338915 CH22 DJ32I10.GENSCAN.12-1 0.101 

334844 CH22 FGENES.439.24 0.101 

336642 CH22 FGENES.23-4 0.101 

334906 CH22 FGENES.452_21 0.101 
65 333188 CH22_FGENES.98_8 0.101 

300088 AW299993 EST cluster (not in UniGene) with exon hit 0.101 

329373 CHJ(Jtsgil6682537 0.102 

331120 R46576 Hs.23239 ESTs 0.102 

335856 CH22_FGENES.628J 0.102 
331888 AA431337 Hs.98017 ESTs 0.102 

333154 CHZLFGENES.89_4 0.102 

335989 CH22__FGENES.655_2 0.102 

304385 AA235602 EST singleton (not in UniGene) with exon hit 0.102 

338016 CH22_EM:AC005500.GENSCAN.133-1 0.102 

335190 CH22_FGENES.507_5 0.102 

318595 T39486 Hs.6137 ESTs 0.102 

333697 CH22_FGENES.250J1 0.102 

306526 AA989713 EST singleton (not in UniGene) with exon hit 0.103 

10 328734 CH.07_hsgi]5868289 0.103 

307294 AI205612 Hs.73742 ribosomal protein; large; PO 0.103 

327424 CH.02_hsgi|5867751 0.103 

335872 CH22_FGENES.630_3 0.103 

333572 CH22_FGENES.189_1 0.103 

15 334774 CH22_FGENES.430_6 0.103 

338660 CH22_EMAC005500.GENSCAN.46M 0.103 

326713 CR20LhsgiI5867595 0.103 

333994 CH22.FGENES.310J8 0.103 

335800 CH22_JGENES.613_4 0.103 

20 318113 AI187943 Hs.132322 ESTs 0.103 

337278 CH22_FGENES.665-1 0.103 

336386 CH22_FGENES.822_6 0.103 

334790 CH22.FGENES.432J5 0.103 

303778 AW505368 EST cluster (not in UniGene) with exon hit 0.104 

25 336524 CH22_FGENES.839_5 0.104 

328938 CH.08_hsgi|5868500 0.104 

335102 CH22_FGENES.494_7 0.104 

300935 AA513644 Hs.222815 ESTs; Weakly similar to Wiskott-Aldrich Syndrome protein [H.sapiens] 0.104 

307581 AI284415 EST singleton (not in UniGene) with exon hit 0.104 

317301 AW291683 Hs.226056 ESTs 0.104 

335330 CH22_FGENES.535_3 0.104 

337968 CH22_EMAC005500.GENSCAN.103-2 0.104 

335627 CH22_FGENES.584_7 0.104 

35 336274 CH22J=GENES.762_2 0.104 

334730 CH22 FGENES.424_5 0.105 

334409 CH22_FGENES.383_6 0.105 

327237 CH.01_hsgi|5B67544 0.105 

333321 CH22.FGENES.138J3 0.105 

40 303181 AA452366 EST cluster (not in UniGene) with exon hit 0.105 

333738 CH22_FGENES.261_2 0.105 

338255 CH22_EM:AC005500.GENSCAN.276-3 0.105 

334282 CH22_FGENES.369J2 0.105 

330190 CH.05_p2gil6165182 0.105 

310748 AW014249 Hs.158698 ESTs 0.105 

338150 CH22_EM:AC005500.GENSCAN.207-2 0.105 

336719 CH22_FGENES.82-6 0.105 

330228 CH.05_p2gi|6013527 0.105 

327801 CH.05_hsgi|5867924 0.105 

330525 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.105 

334972 CH22_FGENES.468_2 0.105 

335111 CH22 FGENES.494J9 0.106 

334483 CH22_FGENES.395_5 0.106 

328829 CH.07_hsgil5868337 0.106 

55 302753 M74299 EST cluster (not in UniGene) with exon hit 0.106 

334512 CH22_FGENES.398_10 0.106 

330024 CH.16j)2gi|6671908 0.106 

321030 AI769930 Hs.233617 Homo sapiens (clone B3B3E13) Huntington's disease candidate region 0.107 

60 338410 CH22 EMAC005500.GENSCAN.341-6 0.107 

334353 CH22J r GBIES.376_5 0.107 

338276 CH22 EM:AC005500.GENSCAN.288-9 0.107 

329053 CHJ(_hsgi|5868574 0.107 

336560 CH22_FGENES.842_5 0.107 

332158 AA621363 Hs.112980 EST 0.107 

336447 CH22_FGENES.829_4 0.107 

333703 CH22-FGENES^50J7 0.107 

326207 CH.17_hsgI|5867222 0.107 

333232 CH22_.FGENES.108J 0.107 
334802 CH22_FGENES.435_1 0.107 

303784 AA704983 EST cluster (not in UniGene) with exon hit 0.107 

338847 CH22_OJ246D7.GENSCAN.10-2 0.107 

339407 CH22 DJS79N16.GENSCAN.1-9 0.108 

5 337635 CH22_C20H12.GENSCAN.32-8 0.108 

334650 CH22_FGENES.417_17 0.108 

308511 AI687580 EST singleton (not in UniGene) with exon hit 0.108 

333392 CH22_FGENES.144 8 0.108 

325840 CH.16_hsgi|6552452 0.108 

10 315044 AW205664 Hs.129568 ESTs 0.108 

333298 CH22_FGENES.133_4 0.108 

335157 CH22 FGENES.501 7 0.108 

333305 CH22.FGENES.137 2 0.108 

326379 CH.19_hsgi|5867327 0.108 

15 335050 CH22_FGENES.482_1 0.108 

305185 AA663985 Hs.248038 major histocompatibility complex; class I; C 0.108 

335658 CH22_FGENES.590_9 0.108 

323040 AA336609 Hs. 10862 ESTs 0.108 

337326 CH22_FGENES.699-6 0.108 

20 339262 CH22_BA354I12.GENSCAN.*6 0.108 

321202 H54052 Hs.163639 ESTs; Weakly similar to INTERCELLULAR ADHESION MOLECULE-1 PRECURSOR [H.sapiens] 0.109 

331792 AA398968 Hs37548 EST 0.109 

333806 CH22J=GENES.278_2 0.109 

321325 AB033100 EST cluster (not in UniGene) 0.109 

331373 AA435513 Hs.178170 ESTs; Weakly similar to DUAL SPECIFICITY PROTEIN PHOSPHATASE 3 0.87 

328775 CH.07_hsgi|5868309 0.109 

335105 CH22_FGENES.494_10 0.109 

30 300975 AI283548 Hs.149668 ESTs 0.109 

324893 T31940 EST cluster (not in UniGene) 0.109 

333397 CH22_FGENES.144J5 0.109 

336484 CH22_FGENES.831_3 0.109 

335507 CH22_FGENES.571_22 0.109 

35 336373 CH22_FGENES.820 3 0.109 

336188 CH22_FGENES.717J2 0.109 

313455 AW081702 Hs.137329 ESTs 0.109 

335185 CH22_FGENES.506_4 0.109 

306814 AI066577 EST singleton (not in UniGene) with exon hit 0.109 

40 311130 AI632322 Hs.195306 ESTs 0.109 

310882 AW080339 Hs.211911 ESTs 0.109 

323383 AI346359 Hs.135209 ESTs 0.11 

300212 AW135925 Hs.184552 biphenylhydrolase-like (serine hydrolase; breast epithelial mucin-assoc 0.11 

45 325675 CH.14 hsgi|5867014 0.11 

330095 CH.19_p2gi|6015278 0.11 

331942 AA453261 Hs.99309 ESTs 0.11 

334723 CH22_FGBYES.421_34 0.11 

333614 CH22_FGENES217J 0.11 

50 337316 CH22.FGENES.692-1 0.11 

305057 AA635626 Hs.62954 ferritin; heavy polypeptide 1 0.11 

338704 CH22_EM^C005500.GENSCAN.480-3 0.11 

335385 CH22_FGENES.543_27 - 0.11 

338012 CH22„EM^C005500.GENSCAN.128-10 0.11 

55 329449 CH.Y_hsgil5868886 0.11 

338980 CH22_DA59H18.GENSCAN.2-4 0.11 

336553 CH22_FGENES.841_10 0.111 

330021 CH.16_p2gg6671889 0.111 

327579 CH.03 hsgi|5867824 0.111 

60 333099 CH22 FGENES.79 4 0.111 

337076 CH22_FGENES.453-4 0.111 

331388 AA456852 Hs.43543 suppressor of white apricot homolog 2 0.111 

306674 AI005542 Hs.160414 heat shock 70kD protein 10 (HSC71) 0.111 

305949 AA884409 EST singleton (not in UniGene) with exon hit 0.111 

330748 AA419217 Hs.15911 DKFZP586E1422 protein 0.111 

333780 CH22_FGENES.273_2 0.111 

323676 AI702835 EST cluster (not in UniGene) 0.111 

308952 AI868157 Hs.224226 EST 0.111 

309338 AW026946 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.111 
329317 CH.X_hsgi|6381976 0.112 

333518 CH22_FGENES.173J3 0.112 

306982 AI127883 EST singleton (not in UniGene) with exon hit 0.112 

336225 CH22_FGENES.728_2 0.112 

333698 CH22_FGENES.250_12 0.112 

302173 AI417947 Hs.14068 ESTs 0.112 

335510 CH22 FGENES.571 25 0.112 

328042 CH.06Jisgi|5902482 0.112 

336512 CH22_FGENES.834J 0.112 

10 328541 CH.07_hsgi|5868486 0.112 

311265 AW205118 Hs.199214 ESTs 0.112 

323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 0.112 

302002 AF013956 Hs.123085 chromobox homolog 4 (Drosophila Pc class) 0.112 

315088 AA557351 Hs.152448 ESTs; Moderately similar to MULTIFUNCTIONAL PROTEIN ADE2 0.112 

15 312581 AI937242 Hs.176590 ESTs 0.112 

322246 AW384710 Hs.125258 ESTs 0.112 

333659 CH22_FGENES241_5 0.113 

327510 CH.02_hsgiI6117815 0.113 

336520 CH22.FGENES.839J 0.113 

20 338682 CH22 EM:AC005500.GENSCAN. 472-1 0.113 

334508 CH22_FGENES.398J 0.113 

322533 T59538 EST cluster (not in UniGene) 0.113 

306873 AI086929 EST singleton (not in UniGene) with exon hit 0.113 

336040 CH22_FGENES.679_2 0.113 

303898 T23215 EST cluster (not in UniGene) with exon hit 0.113 

312011 AW294868 Hs.187226 ESTs 0.113 

335186 CH22 FGENES.506 5 0.113 

333607 CH22_FGENES.216_2 0.113 

305549 AA773530 EST singleton (not in UniGene) with exon hit 0.113 

333686 CH22.FGENES.249 J 0.113 

334352 CH22_FGENES.376_3 0.113 

338195 CH22_EM^C005500.GENSCAN^33-18 0.114 

333588 CH22.FGENES206J 0.114 

339233 CH22_BA354l12.GENSCANJ2-3 0.114 

35 337455 CH22J=GENE$.777-1 0.114 

309101 AI925108 EST singleton (not in UniGene) with exon hit 0.114 

328522 CH.07Jisgi|5868477 0.114 

323999 AI537333 Hs.252782 ESTs 0.114 

333517 CH22_FGENES.173_2 0.114 

40 329935 CH.16_p2 gi|6165200 0.114 

326226 CH.17_hsgi|5867230 0.114 

335890 CH22 FGENES.633J 0.114 

336715 CH22 FGENES.77-1 0.114 

327640 CH.04_hs gi|5867890 0.114 

45 338842 CH22_OJ246D7.GENSCAN.7-1 0.114 

306534 AA991487 EST singleton (not in UniGene) with exon hit 0.114 

336597 CH22_FGENES.266J 0.114 

321010 Y17456 Hs.227150 Homo sapiens LSFR2 gene; last exon 0.114 

302294 AA159213 Hs.5337 isocitrate dehydrogenase 2 (NADP+); mitochondrial 0.114 

324895 N44238 Hs.77515 inositol 1;4;5-triphosphate receptor; type 3 0.114 

327358 CH.01_hsgi|6552411 0.114 

308792 AI815153 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.115 

325886 CH.16_hsgi|5867087 0.115 

336850 CH22.FGENES272-11 0.115 

305858 AA863103 EST singleton (not in UniGene) with exon hit 0.115 

302569 AC004472 multiple UniGene matches 0.115 

336158 CH22.JGENES.707J 0.115 

327866 CH.06 hsgi|5868131 0.115 

339157 CH22_DA59H18.GENSCAN.67-3 0.115 

60 339258 CH22_BA354l12.GENSCAN.8-3 0.115 

336129 CH22 FGENES.701J7 0.115 

333684 CH2^FGENES_M9J 0.115 

309618 AW190162 Hs.184776 ribosomal protein L23a 0.115 

312926 AA954097 Hs.127523 ESTs 0.115 

302640 AB035698 EST cluster (not in UniGene) with exon hit 0.115 

328988 CH.08_hs gi|6456775 0.115 

327902 CH.06_hsgil5868158 0.115 

321927 AJ223366 EST cluster (not in UniGene) 0.115 

335962 CH22_FGENES.651j4 0.115 
334927 CH22.FGENES.460J 0.115 

330535 U11872 Human interleukin-8 receptor type B (IL8RB) mRNA, splice variant IL8RB1 0.856 

328591 CH.07Jisgil5868227 0.115 

5 334902 CH22_FGENES.452_16 0.115 

328525 CH.07_h$gi|5868482 0.115 

325870 Oi16_hsgi|6682492 0.116 

337522 CH22.FGENES.819-1 0.116 

305079 AA641329 EST singleton (not in UniGene) with exon hit 0.116 

10 327343 CH.01 hsgi[6017017 0.116 

333918 CH22 FGENES.296 7 0.116 

333600 CH22_FGENES.213_2 0.116 

335846 CH22_FGENES.623_6 0.116 

333510 CH22 FGENES.171_4 0.116 

15 327629 CH.04"h$gi|5867872 0.116 

333470 CH22.FGENES.16U 0.116 

326855 CH.20Ji$gi|6552460 0.116 

327008 CH.21jY$oj|5667664 0.117 

337480 CH22_FGENES.795-3 0.117 

20 336425 CH22.FGENES.824J0 0.117 

321964 AL079587 Hs.171065 ESTs 0.117 

335651 CH22_FGENES.590_2 0.117 

308164 AI521574 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.117 

337927 CH22 EM:AC005500.GENSCAN.80-3 0.117 

25 300341 H45095 Hs.153524 ESTs 0.117 

300154 AI245127 Hs.179331 ESTs 0.117 

306295 AA937331 EST singleton (not in UniGene) with exon hit 0.117 

329670 CH.14_p2gi|6272129 0.117 

335612 CH22J=GENES.583_6 0.117 

30 307845 AI363450 EST singleton (not in UniGene) with exon hit 0.117 

330401 D28383 Human mRNA for ATP synthase B chain, 5'UTR (sequence from the 5'cap to the start codon) 0.117 

327127 CH51_hsgi|6682520 0.117 

333843 CH22.FGENES.290J 0.117 

331083 R17762 Hs.22292 ESTs 0.117 

329140 CHJC.hsgil6017060 0.117 

339338 CH22_BA354l12.GENSCAN.27-3 0.117 

331974 AA464518 Hs.99616 ESTs 0.117 
338631 CH22_EM:AC005500.GENSCAN.454-2 0.117 

40 330299 CH.06_p2gi|2905881 0.117 

330351 CH.09_p2gi|3056622 0.117 

305377 AA715714 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.117 

333106 CH22 FGENES.79 12 0.117 

3385U CH22 EM.ACOQ5500.GENSCAN.3924 0.117 

45 327335 CH.01_hsgi|5902477 0.117 

301970 AB028962 Hs.120245 KIAA1039 protein 0.118 

326339 CH,17Jisgi|6Q56311 0.118 

330612 X15673 Hs.93174 Human endogenous retrovirus pHE.1 (ERV9) 0.118 

334178 CH22 FGENES.350.6 0.118 

50 328008 CH.06Jisgi[5902482 0.118 

329976 CH.16_p2 gi|4878063 0.118 

320952 AA897432 Hs.130411 ESTs 0.118 

305621 AA789095 EST singleton (not in UniGene) with exon hit 0.118 

337850 CH22 EWtAC005500.GENSCAN.34-3 0.118 

55 333626 CH22 FGENES.224_2 0.118 

337672 CH22_EM:AC000097.GENSCAN.67-1 0.118 

328803 CH.07_hsgi|6004475 0.118 

325922 CH.16_hsgi|5867122 0.118 

334489 CH22.FGENES.397J 0.118 

60 320638 R54766 Hs.101120 ESTs 0.118 

321932 AA569229 EST cluster (not in UniGene) 0.118 

336958 CH22_FGENES.367-1 0.118 

332082 AA600176 Hs.112345 ESTs 0.118 

306004 AA889992 EST singleton (not in UniGene) with exon hit 0.118 

65 336803 CH22_FGENES.194-1 0.118 

309107 AI925823 EST singleton (not in UniGene) with exon hit 0.118 

336859 CH22_FGENES593-9 0.118 

337935 Oi22_BMC005500.GENSCAN.85-6 0.118 

326492 CH.19_hsgi|5867422 0.118 
327289 CH.01_hsgi|5867481 0.119 

325818 CH.14_hs 9^6682490 0.119 

310787 AW262580 Hs.159040 ESTs 0.119 

330028 CH.16_p2 gi|6671908 0.119 

5 325317 CH.11_hs gi|5866878 0.119 

335279 CH22 FGENES.523 7 0.119 

331720 AA192173 Hs.221530 ESTs 0.119 

329186 CH.X hsgi|5868711 0.119 

316012 AA764950 Hs.119898 ESTs 0.119 

10 338316 CH22 EM:AC005500.GENSCAN.304-2 0.119 

326033 CH.17 hsgil5867178 0.119 

334745 CH22 FGENES.426 3 0.119 

333051 CH22_FGENES.73_5 0.119 

301763 R01279 EST cluster (not in UniGene) with exon hit 0.12 

304502 AA454809 Hs.172928 collagen; type I; alpha 1 0.12 

335680 CH22_FGENES.594_5 0.12 

304678 AA543556 EST singleton (not in UniGene) with exon hit 0.12 

335441 CH22_FGENES.560_4 0.12 

336187 CH22_FGENES.717_11 0.12 

309422 AW087175 EST singleton (not in UniGene) with exon hit 0.12 

336047 CH22_FGENES.679_9 0.12 

309651 AW195850 EST singleton (not in UniGene) with exon hit 0.12 

308547 AI695385 Hs.201903 EST 0.12 

304443 AA399444 EST singleton (not in UniGene) with exon hit 0.12 

25 336245 CH22_FGENES.746_3 0.12 

302703 H72333 EST cluster (not in UniGene) with exon hit 0.12 

335690 CH22_FGENES.596_5 0.12 

328941 CH.08_hsgi|6456765 0.12 

333873 CH22_FGENES.291_9 0.12 

30 317246 AW105092 Hs.155690 ESTs 0.12 

339288 CH22_BA354H2.GENSCAN.16-6 0.12 

337996 CH22 EM:ACO05500.GENSCAN.116-3 0.12 

333304 CH22_FGENES.137_1 0.121 

308332 AI591235 EST singleton (not in UniGene) with exon hit 0.121 

35 329319 CKX_hsgi|6381976 0.121 

302086 X57138 multiple UniGene matches 0.121 

333290 CH22_FGENES.129_2 0.121 

323825 AI793080 Hs.123525 ESTs; Weakly similar to NEUTROPHIL GELATINASE-ASSOCIATED LIPOCALIN PRECURSOR [R.norvegicus] 0.121 

330575 U64105 Hs.252280 Rho guanine nucleotide exchange factor (GEF) 1 0.121 

305274 AA679990 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.121 

333647 CH22_FGENES.235.2 0.121 

302251 AA333340 EST cluster (not in UniGene) with exon hit 0.121 

329777 CH.14_p2gil6002090 0.121 

45 333155 CH22_FGENES.89_5 0.121 

326122 CH.17_hsgil5867194 0.121 

335310 CH22.FGENES532J 0.121 

335453 CH22.FGENES.562J3 0.122 

305103 AA643329 Hs.111334 ferritin; light polypeptide 0.122 

50 337284 CH22_FGENES.667-2 0.122 

337418 CH22.FGENES.758-4 0.122 

313073 AI963740 Hs.46826 ESTs 0.122 

303759 AW504164 EST cluster (not in UniGene) with exon hit 0.122 

300017 M33197 AFFX control: GAPDH 0.122 

316725 AW135084 Hs.127264 ESTs 0.122 

330738 AA293153 Hs.120980 nuclear receptor co-repressor 2 0.122 

336466 CH22_FGENES.829_25 0.122 

335956 CH22_FGENES.647_3 0.122 

60 315308 AA780564 Hs.189053 ESTs 0.122 

338925 CH22_DJ32I10.GENSCAN.14-3 0.122 

334969 CH22_FGENES.466_2 0.122 

322050 AL137589 EST cluster (not in UniGene) 0.122 

339084 CH22 DA59H18.GENSCAN.38-2 0.122 

65 338323 CH22_EM:AC005500.GENSCAN.306-2 0.122 

337003 CH22 FGENES.419-7 0.122 

325470 CH.12J\s Q i|6017G34 0.123 

336503 CH22.FGENES.833J0 0.123 

330786 060374 Hs.58712 EST 0.123 
329446 CH.Y_hs gi|5868886 0.123 

303326 AA229433 Hs.222634 ESTs; Moderately similar to ubiquitin-like protein / ribosomal protein S30 0.123 

309067 AI916313 Hs.212788 EST 0.123 

317464 AA968472 Hs.130463 ESTs 0.123 

328755 CH.07_hsgi|5868301 0.123 

326036 CH.17Jisgi|5867178 0.123 

327208 CH.01 hsgi|5867447 0.123 

326124 CH.17_hsgi|5916395 0.123 

10 327509 CH.Q2_hsgi|6117815 0.123 

338398 CH22_EM:AC005500.GENSCAN.336-5 0.123 

304652 AA527782 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.123 

335797 CH22_FGENES.612_6 0.124 

15 336714 CH22_FGENES.76-29 0.124 

327204 CH.01_hsgil5867447 0.124 

331881 AA430672 Hs.123778 ESTs 0.124 

306971 AI126509 EST singleton (not in UniGene) with exon hit 0.124 

336174 CH22 FGENES.710 1 0.124 

20 336126 CH22 FGENES.701 13 0.124 

329129 CHJUisgi|6588026 0.124 

303049 AW407562 EST cluster (not in UniGene) with exon hit 0.124 

335778 CH22_FGENES.607_14 0.124 

336601 CH22_FGENES.369_2 0.124 

25 334340 CH22 FGENES.375J7 0.124 

337436 CH22_FGENES.76M 0.124 

306013 AA896990 EST singleton (not in UniGene) with exon hit 0.124 

339213 CH22_FF113D11.GENSCAN.6-8 0.124 

335355 CH22 FGENES.541_2 0.124 

30 336552 CH22_FGENES.841_9 0.124 

336384 CH22_FGENES.822_4 0.124 
310485 AI286202 Hs.149800 ESTs 0.125 
335840 CH22 FGENES.622.3 0.125 
336444 CH22_FGENES.827_10 0.125 

35 315703 N36070 EST cluster (not in UniGene) 0.125 

327763 CH.05 hsgi|5867961 0.125 

336383 CH22 FGENES.822_3 0.125 

333496 CH22.FGENES.168J6 0.125 

328662 CH.07_hsgiI6004473 0.125 

40 338986 CH22_DA59H18.GENSCAN.5-1 0.125 

328311 CH.07Jsgi|586B371 0.125 

337241 CH22.FGENES.644-2 0.125 

336933 CH22_FGENES.350-7 0.125 

313483 AW294432 Hs.144252 ESTs 0.125 

45 326116 CH.17_hsgi[5867193 0.125 

330450 HG363-HT363 Epidermal Growth Factor Receptor-Related Protein 0.125 

307491 AI268539 EST singleton (not in UniGene) with exon hit 0.125 

331852 AA418988 Hs.98314 Homo sapiens mRNA; cDNA DKFZp586L0120 (from clone DKFZp586L0120) 0.125 

50 330462 HG944-HT944 Dopamine Receptor D4 0.125 

304410 AA284508 EST singleton (not in UniGene) with exon hit 0.125 

336385 CH22_FGENES.822_5 0.125 
336793 CH22.FGENES.176-3 0.125 
326243 CH.17_hsgi[5867261 0.125 

55 327266 CHD1_hsgi|5867462 0.125 

320753 AF070579 Hs.181544 Homo sapiens clone 24487 mRNA sequence 0.125 

336960 CH22_FGENES.369-5 0.125 

329667 CH.14_p2giI6272129 0.125 

328168 CH.06_hsgi(5868071 0.125 

60 336534 CH22 FGENES.839J6 0.125 

339289 CH22_BA354M2.GENSCAN.16-9 0.126 

309230 AI970747 EST singleton (not in UniGene) with exon hit 0.126 

339190 CH22_FF113D11.GENSCAN.1-2 0.126 

337086 CH22.FGENES.458-14 0.126 

319233 R21054 Hs.211522 ESTs 0.126 

339396 CH22_BA232E17.GENSCAN.fr8 0.126 

331930 AA449077 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (from clone DKFZp586H192 0.126 

308099 AI475914 EST singleton (not in UniGene) with exon hit 0.126 
338477 CH22_EM:AC005500.GENSCAN.373-5 0.126 

334286 CH22J=G ENES.369J 6 0.126 

317245 AI025039 Hs.131732 ESTs 0.126 

335249 CH22_FGENES.516J0 0.126 

5 333327 CH22_FGEMES.138_20 0.126 

304240 AA009802 EST singleton (not in UniGene) with exon hit 0.126 

335464 CH22 FGENES.562 26 0.126 

335236 CH22 FGENES.515_8 0.126 

334154 CH22 FGENES.340.4 0.126 

10 309257 AI984183 EST singleton (not in UniGene) with exon hit 0.126 

310015 AI220122 Hs.201981 ESTs; Weakly similar to breast carcinoma-associated antigen [H.sapiens] 0.126 

328280 CH.07_hsgi|5868352 0.126 

305744 AA631819 EST singleton (not in UniGene) with exon hit 0.126 

15 327430 CH.G2Jisgi|5867754 0.126 

328323 CH.07_hsgi|5868373 0.126 

333274 CH22_FGENES.123_2 0.126 

337193 CH22 FGENES.575^3 0.127 

334820 CH22_FGENES.437_2 0.127 

20 328706 CH.07_hsgi|5868270 0.127 

331228 W67267 Hs.174911 ESTs 0.127 

307205 AI192479 EST singleton (not in UniGene) with exon hit 0.127 

337123 CH22_FGENES.519-3 0.127 

326201 CH.17Jisgi|5867216 0.127 

25 335276 CH22_FGENES.523_2 0.127 

331202 T81115 Hs.191136 ESTs 0.127 

330532 U03187 Hs.121544 interleukin 12 receptor; beta 1 0.127 

321235 N49521 EST cluster (not in UniGene) 0.127 

301743 F12605 Hs.204529 ESTs; Weakly similar to reverse transcriptase [H. sapiens] 0.127 

30 328175 CH.06Jisgi|5868073 0.127 

306407 AA971985 EST singleton (not in UniGene) with exon hit 0.127 

327145 CH.01_hsgi|5867548 0.127 

327649 CH.04_hsgi[5867899 0.127 

335142 CH22_FGENES.498J2 0.127 

35 333909 CH22_FGENES.295_2 0.127 

330608 X04325 Hs.2679 gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth neuropathy; X-linked) 0.127 

330158 CH21_p2gi|6580367 0.127 

320153 AF064594 Hs.120360 phospholipase A2; group VI 0.127 

314407 AA098835 Hs.224432 ESTs 0.127 

333383 CH22 FGENES.143 22 0.127 

320663 AI734242 Hs.244473 ESTs 0.128 

326233 CH.17_hsgi|5867232 0.128 

326598 CH20_hsgi|5867634 0.128 

45 335174 CH22_FGENES.504_4 0.128 

319843 H29920 Hs.99486 ESTs; Weakly similar to araiarl [H.sapiens] 0.128 

335458 CH22.FGENES.562J8 0.128 

332997 CH22_.FGENES.58_4 0.128 

334188 CH22.FGENES.352J3 0.128 

50 329759 CH.14j>2gi[6048280 0.128 

330348 CH.09_p2gi|4544475 0.128 

326958 CH.21_hsgi|6469836 0.128 

305263 AA679467 EST singleton (not in UniGene) with exon hit 0.128 

337693 CH22.EMAC000097.GENSCAN.78-14 0.128 

55 326812 CH20_hsgi|6682504 0.128 

333237 CH22_FGENES.108_7 0.128 

333699 CH22_FGENES250_13 0.128 

311496 AI768677 Hs.209888 ESTs; Weakly similar to phosphatidylserine synthase-2 [M.musculus] 0.128 

60 336499 CH22_FGENES.833_4 0.128 

320087 AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD 0.128 

309969 AI184186 Hs.197813 ESTs 0.128 

301490 AW298468 Hs.250461 ESTs 0.128 

337011 CH22_FGENES.427-6 0.128 

65 315052 AA876910 Hs.134427 ESTs 0.128 

301611 W22172 Hs.29038 ESTs 0.128 

336497 CH22_FGENES.833_2 0.129 

302068 Y16280 Hs.132049 endothelin type b receptor-like protein 2 0.129 

334502 CH22_FGENES397_18 0.129 
314121 
337593 
332881 
305836 
339059 
305610 
305852 
327409 
312751 
308726 
325961 
311159 
322715 
336441 



45 326904 



337825 
325257 
331188 
330645 
331760 
322995 
335497 
334824 
319480 
334842 
333335 
317252 
329034 
305186 
335755 



AA158884 EST singleton (not in UniGene) with exon hit 0.129 

AA465405 EST singleton (not in UniGene) with exon hit 0.129 

R46180 Hs.153485 ESTs 0.129 

AI685841 Hs.161354 ESTs 0.129 

AF142579 EST cluster (not in UniGene) with exon hit 0.129 

AI985821 Hs.62954 ferritin; heavy polypeptide 1 0.129 
H42142 Hs.226396 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 19 (Dbp5; yeast; homolog) 0.129 

CH22 FGENES.3614 0.129 

CH.19_p2gi|6015202 0.129 

CH22_FF1 13D1 1 .GENSCAN.6-7 0.129 

CH.21_hsgi|60O4446 0.129 

AA662939 EST singleton (not in UniGene) with exon hit 0.129 

AI559492 EST singleton (not in UniGene) with exon hit 0.129 

CH22_FGENES.537-5 0.13 

U29112 EST cluster (not in UniGene) 0.13 

AA515554 Hs.119598 ribosomal protein L3 0.13 

AA745289 Hs.173088 ESTs 0.13 

CH22J)A59H18.GENSCANJ20* 0.13 

CH.19_p2gi|6O15202 0.13 

CH22_FGENES.138J 0.13 

CH22_EMAC005500.GENSCAN.12M 0.13 

AA232134 Hs.190028 ESTs 0.13 

AI239845 Hs.128494 ESTs; Weakly similar to EG:95B72 [D.melanogaster] 0.13 

CH22_EM:AC005500.GEN SCAN .398-1 1 0.13 

CH22_FGENES.652_1 0.13 

ESTs 0.13 

CH22 C20H12.GENSCAN.6-8 0.13 

CH22_FGENES.33J 0.13 

EST singleton (not in UniGene) with exon hit 0.13 

CH22_DA59H18.GENSCAN.30-5 0.13 

EST singleton (not in UniGene) with exon hit 0.13 

EST singleton (not in UniGene) with exon hit 0.13 

CH.02_hsgi|5867750 0.13 

Hs.164178 ESTs 0.13 

Hs.209929 EST 0.13 

CH.16_hsgi|5867147 0.13 

Hs.197636 ESTs 0.13 

Hs.182135 ESTs 0.13 

CH22_FGENES.827_7 0.13 

CH22_FGENES.814_12 0.13 

EST singleton (not in UniGene) with exon hit 0.13 

CH22 FGENES.217_8 0.13 

CH2^EMAC005500.GENSCAN.384-17 0.131 

CH.21_h$gi|5867684 0.131 

CH22 FGENES.717-1 0.131 

CH.2u_hsgi|5867615 0.131 

EST singleton (not in UniGene) with exon hit 0.131 

EST cluster (not in UniGene) with exon hit 0.131 

CH22_EIAAC005500.GENSCAN^59-22 0.131 

CH22_FGENES.272_5 0.131 

CH22.FGENES.54_8 0.131 

CH22_FGENES341J2 0.131 

CH22_FGENES.635_4 0.131 

CH22_EMAC005500.GENSCAN.13-19 0.131 

CH.11_hsgi|5866895 0.131 

Hs.167837 ESTs 0.131 

Hs.144879 dual specificity phosphatase 9 0.131 

Hs.154434 ESTs; Weakly similar to unknown [H.sapiens] 0.131 

Hs.29797 ribosomal protein L10 0.131 

CH22_FGENES.571j5 0.131 

CH22_FGENES.437_6 0.131 

R06933 Hs.184221 ESTs 0.131 

CH22_FGENES.439_21 0.131 

CH22.FGENES.139J 0.131 

AA905178 Hs.130124 ESTs 0.131 

CHX_hsgi|5868561 0.131 

AA664230 EST singleton (not in UniGene) with exon hit 0.131 

CH22_FGENES.604_4 0.131 
302143 H15270 Hs.169847 putative neuronal cell adhesion molecule 0.131 

334939 CH22_FGENES.465_3 0.131 

318994 C15110 Hs.17802 ESTs 0.131 

334498 CH22_FGENES.397_14 0.131 

5 333413 CH22_FGENES.146_2 0.132 

329676 CH.14_p2 gi|6272128 0.132 

327277 CH.01_hsgi)5867473 0.132 

305022 AA627416 EST singleton (not in UniGene) with exon hit 0.132 

336805 CH22 FGENES.196-3 0.132 

320121 T93557 EST cluster (not in UniGene) 0.132 

334761 CH22_FGENES.428J0 0.132 

339400 CH22_BA232E17.GENSCAN.7^6 0.132 

330301 CH.06_p2gi|2905862 0.132 

316822 AA827691 Hs.129967 ESTs; Weakly similar to neuronal thread protein AD7c-NTP [H.sapiens] 0.132 

328020 CH.06_hsgi|5902482 0.132 

325327 CH.11_hsgi|5866875 0.132 

321163 AA209530 EST cluster (not in UniGene) 0.132 

336393 CH22_FGENES.823_5 0.132 

20 325905 CH.16_hsgi|5867104 0.132 

305237 AA676286 Hs.2186 eukaryotic translation elongation factor 1 gamma 0.132 

339046 CH22_DA59H18.GENSCAN.2B-6 0.132 

325375 CH.12_hsgi|5866920 0.132 

333961 CH22_FGENES.304_7 0.132 

25 335450 CH22_FGENES.562_8 0.133 

302286 R58438 EST cluster (not in UniGene) with exon hit 0.133 

335116 CH22_FGENES.496_3 0.133 

327333 CH.01_hsgi|5902477 0.133 

308070 AI470948 EST singleton (not in UniGene) with exon hit 0.133 

30 308311 AI581855 EST singleton (not in UniGene) with exon hit 0.133 

320813 AW360847 Hs.208839 ESTs 0.133 

323665 AW248307 EST cluster (not in UniGene) 0.133 

328318 CH.07J>sgi|5868373 0.133 

320603 R51419 EST cluster (not in UniGene) 0.133 

35 332791 CH22.FGENES.3_1 0.133 

314976 AA524725 Hs.162108 ESTs 0.133 

303309 AL134164 Hs.224868 ESTs 0.133 

320581 R39753 Hs.170187 ESTs 0.133 

333944 CH22_FGENES.3Q2_2 0.133 

40 317992 AI733512 Hs.130901 ESTs 0.133 

330935 F02383 Hs.26492 beta-1;3-glucuronyltransferase 3 (glucuronosyltransferase I) 0.133 

336659 CH22.FGENES.36-5 0.133 

338887 CH22_DJ32I10.GENSCAN.6-10 0.133 

305273 AA679979 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.133 

45 333566 CH22_FGENES.183_2 0.134 

316952 AW450033 Hs.163312 ESTs 0.134 

333818 CH22.FGENES.283J 0.134 

328687 CH.07__hsgi|586B262 0.134 

302879 H11602 EST cluster (not in UniGene) with exon hit 0.134 

50 336557 CH22_FGENES.B42_2 0.134 

335222 CH22_FGENES.513_5 0.134 

338094 CH22_EMlAC005500.GENSCAN.179-3 0.134 

337384 CH22_FGENES.745-1 0.134 

327360 CH.01_hsgi|6552411 0.134 

55 328132 CH.06_hsgi|5868038 0.134 

323604 AI751438 Hs.182827 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! 0.134 

337591 CH22_C20H12.GENSCAN.^6 0.134 

307018 AI140639 EST singleton (not in UniGene) with exon hit 0.134 

60 326896 CH.21_hsgi|5867680 0.134 

333479 CH22_FGENES.163_5 0.134 

337915 CH22_EMAC005500.GENSCAN.61-3 0.134 

335110 CH22_FGENES.494_18 0.134 

333481 CH22_FGENES.163_9 0.134 

327512 CH.02_hsgi|6117815 0.134 

300096 AW328639 Hs.83575 ESTs; Weakly similar to 2C328.3 [C.elegans] 0.134 

330163 CH.02_p2 gi[6042042 0.135 

335752 CH22_FGENES.604_1 0.135 

334857 CH22J=GENES.443_1 0.135 
301872 H84730 EST cluster (not in UniGene) with exon hit 0.135 

337529 CH22.FGENES.823-29 0.135 

335734 CH22_FGENES.601_4 0.135 

337551 CH22.FGENES.847-8 0.135 

303076 AI920965 Hs.77961 major histocompatibility complex; class I; B 0.135 

335513 CH22_FGENES.571.28 0.135 

339078 CH22_DA59H18.GENSCAN.37-6 0.135 

321907 N56660 Hs.148722 ESTs; Weakly similar to large tumor suppressor 1 [H.sapiens] 0.135 

337189 CH22_FGENES.571-32 0.135 

10 329635 CH.12j>2gi]5302817 0.135 

308601 AI719930 EST singleton (not in UniGene) with exon hit 0.135 

305020 AA627248 Hs.2064 vimentin 0.135 

333894 CH22_FGENES.293_1 0.135 

322465 AA137152 H^3784 ESTs; Highly similar to phosphoserine aminotransferase [H.sapiens] 0.135 

305601 AA780975 EST singleton (not in UniGene) with exon hit 0.135 

332186 H10781 Hs.141051 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB WARNING ENTRY 0.135 

327822 CH.05_hsgi|5867968 0.135 

310087 AI393914 Hs.160624 ESTs; Weakly similar to similar to CR16; SH3 domain binding protein 0.135 

328752 CH.07_hsgi|5868298 0.135 

337611 CH22_C20H12.GENSCAN.19-4 0.135 

334470 CH22_FGENES.394_1 0.136 

25 335115 CH22_FGENES.496_2 0.136 

328730 CH.07_hsgi|5868289 0.136 

330350 CH.09_p2 gi|3056622 0.136 

336971 CH22_FGENES.378-6 0.136 

308258 AI565612 EST singleton (not in UniGene) with exon hit 0.136 

30 326745 CH20_hsgi|5857611 0.136 

335440 CH22_FGENES.560_3 0.136 

320257 AA330746 EST cluster (not in UniGene) 0.136 

328677 CH.07_hsgi|5868256 0.136 

329731 CH.14_p2gi|6065783 0.136 

35 315950 AA700553 Hs.206974 ESTs 0.136 

330049 CH.17_p2gi|4567182 0.136 

337070 CH22_FGENES.448-3 0.136 

304095 H11324 Hs.31059 EST 0.136 

309304 AW005527 Hs.232820 EST 0.136 

40 333458 CH22_FGENES.157_7 0.136 

329899 CH.15_p2gi|6563505 0.136 

322202 AI275056 Hs.200133 ESTs 0.136 

333991 CH22_FGENES.310_15 0.136 

318617 AW247252 Hs.75514 nucleoside phosphorylase 0.136 

45 310623 AI341586 Hs.195588 ESTs 0.136 

330489 M23323 Hs.3003 CD3E antigen; epsflon polypeptide (TTT3 complex) 0.136 

309646 AW194694 EST singleton (not in UniGene) with exon hit 0.136 

331068 R00071 Hs.191199 ESTs 0.136 

334285 CH22_FGENES.369_15 0.136 

50 332178 F13689 Hs.100725 EST 0.136 

305724 AA827608 EST singleton (not In UniGene) with exon hit 0.136 

303158 AL138110 Hs.B594 Homo sapiens mRNA containing (CAG)4 repeat; clone CZ-CAG-7 0.136 

334543 CH22_FGENES.403_8 - 0.136 

335384 CH22_FGENES.543_26 0.136 

55 336527 CH22 FGENES.B39_8 0.136 

334951 CH22 FGENES.465.20 0.136 

325882 CH.16_hsgi|5867087 0.137 

305134 AA653159 EST singleton (not in UniGene) with exon hit 0.137 

307058 AI148709 EST singleton (not in UniGene) with exon hit 0.137 

60 331943 AA453418 Hs.178272 ESTs 0.137 

331116 R44780 Hs.22634 ESTs 0.137 

306094 AA908877 EST singleton (not in UniGene) with exon hit 0.137 

333561 CH22.FGENES.180J8 0.137 

321439 H61962 EST cluster (not in UniGene) 0.137 

65 324594 AA497090 EST cluster (not in UniGene) 0.137 

337926 CH22_EM:AC005500.GENSCAN.77-4 0.137 

337353 CH22_FGENES.726-1 0.137 

331836 AA412295 Hs.104774 EST 0.137 

308981 AI873242 EST singleton (not in UniGene) with exon hit 0.137
329424 CH.Y_hsgi|5868879 0.137
325829 CH.15_hsgi|5867052 0.137
331845 AM16863 Hs.98183 ESTs 0.137
333854 CH22_FGENES.290_13 0.137
306591 AI000248 EST singleton (not in UniGene) with exon hit 0.137
328948 CH.08_hsgi|6456765 0.137
338935 CH22_DJ32I10.GENSCAN.18-12 0.137
325960 CH.16_hs gi|5867147 0.137
328377 CH.07_hsgi|5868390 0.138
308851 AI829820 EST singleton (not in UniGene) with exon hit 0.138
314620 AA424352 Hs.210586 ESTs 0.138
337592 CH22_C20H12.GENSCAN.S-7 0.138
338684 CH22_EM:AC005500.GENSCAN.472-3 0.138
331800 AA400498 Hs.97543 ESTs 0.138
304587 AA505535 EST singleton (not in UniGene) with exon hit 0.138
333981 CH22_FGENES.310_4 0.138
332452 AA040369 Hs.11170 SYT interacting protein 0.138
305752 AA835278 EST singleton (not in UniGene) with exon hit 0.138
311947 T65554 Hs^51591 EST 0.138
333783 CH22_FGENES^73_5 0.138
337406 CH22_FGENES.754-14 0.138
327976 CH.06_hsgi|5868212 0.138
325593 CH.13_hsgi|5866992 0.138
339425 CH22_DJ579N16.GENSCAN.14-4 0.138
304475 AA428879 EST singleton (not in UniGene) with exon hit 0.138
309488 AW131104 EST singleton (not in UniGene) with exon hit 0.138
337532 CH22_FGENES.827-6 0.138
317234 AA904448 Hs.126368 ESTs 0.138
312261 AA854425 Hs.144455 ESTs 0.138
328927 CH.08_hsgi|5868500 0.138
336424 CH22_FGENES.824_9 0.138
326667 CH.20_hsgi|6552455 0.138
325988 CH.16_hs gi|5867064 0.138
318446 AW300287 EST cluster (not in UniGene) 0.139
336511 CH22_FGENES.834_6 0.139
335204 CH22_FGENES.508_13 0.139
303244 AA147472 EST cluster (not in UniGene) with exon hit 0.139
330870 AA115804 Hs.187593 ESTs 0.139
329376 CH.X_hsgi|5868859 0.139
304703 AA563898 EST singleton (not in UniGene) with exon hit 0.139
333653 CH22_FGENES.239_2 0.139
306799 AI051696 EST singleton (not in UniGene) with exon hit 0.139
304872 AA595289 EST singleton (not in UniGene) with exon hit 0.139
330812 AA013001 Hs.60563 ESTs 0.139
329568 CH.10_p2gi|3962490 0.139
319210 AA253074 Hs.146261 ESTs 0.139
334320 CH22_FGENES.374_5 0.139
300860 AI916949 Hs.149748 ESTs; Weakly similar to weak similarity to collagens [C.elegans] 0.139
305866 AA864533 EST singleton (not in UniGene) with exon hit 0.139
312943 AA984364 Hs.119064 ESTs 0.139
330523 M99439 Hs.83958 transducin-like enhancer of split 4; homolog of Drosophila E(sp1) 0.139
312708 AI076204 Hs.135440 ESTs 0.139
309366 AW072970 EST singleton (not in UniGene) with exon hit 0.139
303273 AA316069 EST cluster (not in UniGene) with exon hit 0.139
317484 AW274696 Hs.143921 ESTs 0.139
333239 CH22_FGENES.111_1 0.139
307126 AI184951 EST singleton (not in UniGene) with exon hit 0.139
316813 AA826505 Hs.124517 ESTs 0.139
331746 AA281365 Hs.121640 ESTs; Weakly similar to KIAA0386 [H.sapiens] 0.139
308558 AI700145 Hs.172182 poly(A)-binding protein; cytoplasmic 1 0.139
310784 AW086142 Hs.159017 ESTs 0.139
323831 AA335715 Hs£00299 ESTs 0.139
307692 AI318342 EST singleton (not in UniGene) with exon hit 0.139
310570 AI318327 EST cluster (not in UniGene) 0.139
327934 CH.06_hsgi|5868184 0.139
305232 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.139
334756 CH22_FGENES.428_5 0.139
331938 AA451867 Hs.99255 ESTs 0.139

301393 AI474722 Hs.150898 ESTs; Weakly similar to KIAA0644 protein [H.sapiens] 0.139
312005 T78450 Hs.13941 ESTs 0.139
338431 CH22_EM:AC005500.GENSCAN.351-4 0.14
331214 T90496 Hs.16757 ESTs 0.14
333601 CH22_FGENES.213_4 0.14
323481 AA278449 Hs.137429 ESTs 0.14
336911 CH22_FGENES.344-4 0.14
338157 CH22_EM:AC005500.GENSCAN.209-5 0.14
327845 CH.05_hsgi|6531962 0.14
319109 Z45662 Hs.90797 Homo sapiens clone 23620 mRNA sequence 0.14
334763 CH22_FGENES.428_12 0.14
329384 CH.X_hs gi|5868869 0.14
302996 AF054663 EST cluster (not in UniGene) with exon hit 0.14
323751 AW452656 Hs.209824 ESTs 0.14
329916 CH.16_p2gi|6223624 0.14
301993 N49826 Hs.18602 ESTs 0.14
338129 CH22_EM:AC005500.GENSCAN.197-2 0.14
325704 CH.14_hsgi|5867028 0.14
335656 CH22_FGENES.590_1 0.14
331673 W72366 Hs.40033 ESTs 0.14
316807 AI018331 Hs.172444 ESTs; Highly similar to transcription regulator [M.musculus] 0.14
310743 AW449754 Hs.158665 ESTs 0.14
326941 CH.21_hsgi|6004446 0.14
328809 CH.07_hsgi|5868327 0.14
323855 AI653164 Hs.128665 ESTs 0.14
304705 AA564064 EST singleton (not in UniGene) with exon hit 0.14
325666 CH.14_hsgi|6469822 0.14
333747 CH22_FGENES.265_6 0.14
318287 AW015616 Hs.143321 ESTs 0.141
332972 CH22_FGENES.51_5 0.141
305704 AA825266 EST singleton (not in UniGene) with exon hit 0.141
315699 AW182805 Hs.189183 ESTs; Weakly similar to Nod1 [H.sapiens] 0.141
327296 CH.01_hsgi|5867492 0.141
336400 CH22_FGENES.823_15 0.141
321033 H26214 Hs.20733 ESTs; Weakly similar to !!!! ALU SUBFAMILY SX WARNING ENTRY 0.141
316522 AI475995 Hs.122910 ESTs 0.141
335715 CH22_FGENES.599_15 0.141
335959 CH22_FGENES.650_1 0.141
333259 CH22_FGENES.118_7 0.141
337382 CH22_FGENES.744-8 0.141
322346 AA227618 Hs.10882 HMG-box containing protein 1 0.141
325378 CH.12_hsgi|5866920 0.141
338500 CH22_EM:AC005500.GENSCAN.390-1 0.141
338460 CH22_EM:AC005500.GENSCAN.362-5 0.141
315279 AW511138 Hs.256581 ESTs 0.141
314439 AI539443 Hs.137447 ESTs 0.141
333624 CH22_FGENES.222_3 0.141
329237 CH.X_hsgi|5868729 0.141
330117 CH.19_p2gi|6015201 0.141
338017 CH22_EM:AC005500.GENSCAN.134-1 0.141
337854 CH22_EM:AC005500.GENSCAN.38-12 0.142
329984 CH.16_p2gi|4646193 0.142
305004 AA622328 Hs.162762 EST 0.142
302815 N40373 EST cluster (not in UniGene) with exon hit 0.142
327823 CH.05_hsgi|5867968 0.142
326753 CH.20_hsgi|5867616 0.142
301201 AA904482 Hs.197775 ESTs 0.142
334303 CH22_FGENES.373_6 0.142
326453 CH.19_hsgi|5867399 0.142
311050 AI864581 Hs.215477 ESTs 0.142
308740 AI802711 Hs.210337 EST; Weakly similar to aldolase A [H.sapiens] 0.142
331003 H63959 Hs.142722 ESTs 0.142
338010 CH22_EM:AC005500.GENSCAN.128-6 0.142
336326 CH22_FGENES.812_4 0.142
318100 R44308 Hs542302 ESTs 0.142
320641 R55421 EST cluster (not in UniGene) 0.142
325855 CH.16_hsgi|5867067 0.142

330425 HG1728-HT1734 Non-Specific Cross Reacting Antigen (Gb:D90277), Alt Splice Form 2 0.142
324583 AA425411 Hs.22581 ESTs 0.142
326268 CH.17_hs gi|5867267 0.142
331390 AA460341 Hs.45008 ESTs 0.142
338904 CH22_DJ32I10.GENSCAN.10-16 0.143
333096 CH22_FGENESJ9_1 0.143
331919 AA446869 Hs.119316 ESTs 0.143
312214 AI248004 Hs.125187 ESTs 0.143
323198 AW179174 Hs.7984 ESTs 0.143
316107 AI204001 Hs.184014 ribosomal protein L31 0.143
301335 AA885317 Hs.190511 ESTs 0.143
337392 CH22_FGENES.747-3 0.143
325543 CH.12_hsgi|6682452 0.143
305903 AA873085 EST singleton (not in UniGene) with exon hit 0.143
332707 L35594 Hs.174185 phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin) 0.143
337913 CH22_EM:AC005500.GENSCAN.59-10 0.143
301436 AA961061 Hs.131696 ESTs 0.143
335078 CH22_FGENES.488_5 0.143
338451 CH22_EM:AC005500.GENSCAN.359-39 0.143
302777 AJ230640 EST cluster (not in UniGene) with exon hit 0.143
330464 J03068 Hs.78223 N-acylaminoacyl-peptide hydrolase 0.143
330988 H41411 Hs.33855 ESTs 0.143
328939 CH.08_hsgi|6004481 0.143
308015 AI440174 Hs.228907 EST; Weakly similar to GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT-LIKE PROTEIN 12.3 [H.sapiens] 0.143
328504 CH.07_hsgi|5868471 0.143
332599 AA402891 Hs.32951 solute carrier family 29 (nucleoside transporters); member 2 0.143
335744 CH22_FGENES.601_15 0.143
322394 AF077208 EST cluster (not in UniGene) 0.143
323892 AL042661 EST cluster (not in UniGene) 0.143
318443 AI939323 Hs.157714 ESTs; Weakly similar to NEURONAL ACETYLCHOLINE RECEPTOR PROTEIN; ALPHA-5 CHAIN PRECURSOR [H.sapiens] 0.143
336568 CH22_FGENES.843_7 0.143
330958 H08815 Hs.159824 EST 0.143
327672 CH.04_hsgi|5867843 0.143
335900 CH22_FGENES.635_8 0.144
336044 CH22_FGENES.679_6 0.144
318845 AI815951 Hs.33183 ESTs; Weakly similar to estrogen-responsive finger protein; efp [H.sapiens] 0.144
333483 CH22_FGENES.165_1 0.144
333337 CH22_FGENES.139_6 0.144
305993 AA889197 EST singleton (not in UniGene) with exon hit 0.144
335719 CH22_FGENES.599_22 0.144
325682 CH.14_hsgi|6138923 0.144
327350 CH.01_hsgi|6249563 0.144
339291 CH22_BA354I12.GENSCAN.18-1 0.144
326358 CH.18_hsgi|5867293 0.144
330316 CH.08_p2gi|6007576 0.144
308150 AI499346 Hs.174131 ribosomal protein L6 0.144
338065 CH22_EM:AC005500.GENSCAN.164-1 0.144
339009 CH22_DA59H18.GENSCAN.18-7 0.144
327776 CH.05_hsgi|5867964 0.145
336664 CH22_FGENES.41-0 0.145
321921 AF070619 EST cluster (not in UniGene) 0.145
319346 T70147 Hs.12024 ESTs 0.145
304265 AA062892 EST singleton (not in UniGene) with exon hit 0.145
303818 Z45986 Hs.250178 copine II 0.145
327498 CH.02_hsgi|6017023 0.145
335227 CH22_FGENES.513_13 0.145
339022 CH22_DA59H18.GENSCAN.22-1 0.145
302597 H55661 Hs.33026 ESTs; Weakly similar to similar to Enterococcus faecalis TRAB [C.elegans] 0.145
308550 AI697008 Hs.201811 EST 0.145
302175 AA262760 Hs.156015 Homo sapiens chromosome 19; cosmid R29381 0.145
303252 AA156760 EST cluster (not in UniGene) with exon hit 0.145
337414 CH22_FGENES.757-2 0.145
310382 AI734009 EST cluster (not in UniGene) 0.145

329333 CH.X_hsgi|5868806 0.145
336857 CH22_FGENES.291-7 0.145
332565 AA234896 Hs.25272 E1A binding protein p300 0.145
318634 AI928098 Hs.156832 ESTs 0.145
336318 CH22_FGENES.801_1 0.145
310950 AI923551 Hs.170843 ESTs 0.145
335346 CH22_FGENES.537_2 0.145
331196 T65416 Hs.12826 ESTs 0.145
337607 CH22_C20H12.GENSCAN.17-3 0.146
331206 T84096 Hs.15284 ESTs 0.146
301793 T80698 EST cluster (not in UniGene) with exon hit 0.146
319590 AA210878 EST cluster (not in UniGene) 0.146
311394 AI695374 Hs.256231 ESTs 0.146
324773 AA632554 Hs.163401 ESTs 0.146
324841 AI142359 Hs.155316 ESTs 0.146
332260 N70088 Hs.138467 ESTs 0.146
329276 CH.X_hs gi|5868762 0.146
335887 CH22_FGENES.633_1 0.146
338294 CH22_EM:AC005500.GENSCAN.297-1 0.146
336993 CH22_FGENES.409-4 0.146
334135 CH22_FGENES.336_2 0.146
326251 CH.17_hsgi|5867263 0.146
337396 CH22_FGENES.749-1 0.146
339167 CH22_DA59H18.GENSCAN.69-8 0.146
316838 AW135418 Hs.161210 ESTs 0.146
325313 CH.11_hsgi|5866865 0.146
331047 N66918 Hs.32205 ESTs 0.146
323915 AL043362 EST cluster (not in UniGene) 0.146
302747 AF062275 EST cluster (not in UniGene) with exon hit 0.146
306317 AA947909 EST singleton (not in UniGene) with exon hit 0.146
334399 CH22_FGENES.382_5 0.146
326472 CH.19_hsgi|5867404 0.146
333061 CH22_FGENES.75_4 0.146
337072 CH22_FGENES.448-5 0.146
334328 CH22_FGENES.375_5 0.146
327039 CH.21_hsgi|6531965 0.146
325576 CH.12_hsgi|6552443 0.147
315935 AI075804 Hs.132660 ESTs 0.147
319638 AA323758 EST cluster (not in UniGene) 0.147
334501 CH22_FGENES.397_17 0.147
338238 CH22_EM:AC005500.GENSCAN.2644 0.147
308636 AI744063 EST singleton (not in UniGene) with exon hit 0.147
338567 CH22_FGENES.843_6 0.147
335819 CH22_FGENES.619_2 0.147
336950 CH22_FGENES.361-8 0.147
307055 AI148477 EST singleton (not in UniGene) with exon hit 0.147
315134 AW504854 Hs.126714 ESTs 0.147
335834 CH22_FGENES.621_1 0.147
327870 CH.06_hsgi|5868131 0.147
323802 AA332011 Hs.250138 protein phosphatase 2C; magnesium-dependent; catalytic subunit 0.147
329412 CH.X_hsgi|6682553 0.147
323791 AA333068 EST cluster (not in UniGene) 0.147
324126 AA385315 EST cluster (not in UniGene) 0.147
327865 CH.06_hsgi|5868130 0.147
333445 CH22_FGENES.154_2 0.147
321302 M021351 Hs.158497 KIAA0724 gene product 0.147
336744 CH22_FGENES.118-9 0.147
323731 AA323414 EST cluster (not in UniGene) 0.148
320289 H07989 EST cluster (not in UniGene) 0.148
305488 AA749000 EST singleton (not in UniGene) with exon hit 0.148
305592 AA780594 Hs.62954 ferritin; heavy polypeptide 1 0.148
304094 H11295 EST singleton (not in UniGene) with exon hit 0.148
325040 AW296368 EST cluster (not in UniGene) 0.148
339034 CH22_DA59H18.GENSCAN.26-2 0.148
334504 CH22_FGENES.398_2 0.148
334778 CH22_FGENES.43U 0.148
320148 U77494 Hs.119687 RAN binding protein 8 0.148
303584 AW173759 Hs.203401 ESTs 0.148
325826 CH.15_hsgi|5867048 0.148
331192 T55182 Hs.152571 ESTs; Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens] 0.148
325785 CH.14_hsgi|6381957 0.148

333166 CH22_FGENES.91_8 0.148
336548 CH22_FGENES.841J> 0.148
337552 CH22_C4G1.GENSCAN.1-4 0.148
331775 AA382742 Hs.97151 EST 0.148
338936 CH22_DJ32I10.GENSCAN.19-6 0.148
331869 AA428554 Hs.104894 ESTs; Weakly similar to fibronectin precursor [H.sapiens] 0.148
332865 CH22_FGENES.28_5 0.148
328663 CH.07_hsgi|6004473 0.148
328438 CH.07_hsgi|5868417 0.148
311158 AI634864 Hs.250789 ESTs; Highly similar to similar to NEDD-4 [H.sapiens] 0.148
336942 CH22_FGENES.354-2 0.148
302262 R53169 Hs.246091 ESTs 0.149
333296 CH22_FGENES.132_3 0.149
333365 CH22_FGENES.142_2 0.149
311706 AW452392 Hs.252854 ESTs 0.149
337109 CH22_FGENES.489-2 0.149
315062 AW173300 Hs.190201 ESTs 0.149
333454 CH22_FGENES.157_3 0.149
334784 CH22_FGENES.432_9 0.149
333255 CH22_FGENES.118_3 0.149
337518 CH22_FGENES.814-7 0.149
320651 AA489268 EST cluster (not in UniGene) 0.149
323437 AA287567 EST cluster (not in UniGene) 0.149
328761 CH.07_hsgi|5868302 0.149
328787 CH.07_hsgi|5868309 0.149
335261 CH22_FGENES.520_2 0.149
300827 R16689 Hs.106004 ESTs 0.149
339263 CH22_BA354I12.GENSCAN.10-1 0.149
337412 CH22_FGENES.756-6 0.149
334414 CH22_FGENES.384_1 0.149
332931 CH22_FGENES.38_5 0.149
310801 AW270980 Hs.106346 novel centrosomal protein RanBPM 0.149
305216 AA669056 EST singleton (not in UniGene) with exon hit 0.149
314779 AM70122 Hs.190261 ESTs 0.149
338414 CH22_EM:AC005500.GENSCAN.341-27 0.149
303342 AW247361 EST cluster (not in UniGene) with exon hit 0.149
337509 CH22_FGENES.806-4 0.149
306631 AI001149 EST singleton (not in UniGene) with exon hit 0.149
302533 L36149 Hs.248116 chemokine (C motif) XC receptor 1 0.149
336536 CH22_FGENES.839_18 0.149
324666 T32458 Hs.14285 ESTs 0.149
310173 AI767433 Hs.170013 ESTs 0.149
333595 CH22_FGENES.211_2 0.149
335975 CH22_FGENES.652_9 0.15
306654 AI003654 EST singleton (not in UniGene) with exon hit 0.15
335025 CH22_FGENES.475_3 0.15
328711 CH.07_hs gi|5868271 0.15
328274 CH.07_hsgi|5868219 0.15
325505 CH.12_hsgi|6682451 0.15
329641 CH.14_p2gi|6468233 0.15
304955 AA613504 EST singleton (not in UniGene) with exon hit 0.15
339103 CH22_DA59H18.GENSCAN.44-10 0.15
329636 CH.12_p2gi|5302817 0.15
310118 AI203293 Hs.157489 ESTs 0.15
326056 CH.17_hsgi|5867184 0.15
303773 AA769074 EST cluster (not in UniGene) with exon hit 0.15

303153 U09759 Hs.8325 mitogen-activated protein kinase 9 0.15

TABLE 13A shows the accession numbers for those primekeys lacking UniGene IDs in
Table 13. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 
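
As an illustration of the three-column layout described above, the following minimal Python sketch reads one Table 13A row into a record. It assumes a row has been restored to a single line of whitespace-separated fields (Pkey, CAT number, then zero or more Genbank accessions); the names `ClusterRow` and `parse_cluster_row` are hypothetical and are not part of the original disclosure.

```python
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    """One Table 13A row (hypothetical record type for illustration)."""
    pkey: int              # Eos probeset identifier
    cat_number: str        # gene cluster number, kept as text (e.g. "33803.1")
    accessions: List[str]  # Genbank accessions that make up the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a single-line row into Pkey, CAT number and accession list."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least Pkey and CAT number, got: {line!r}")
    return ClusterRow(pkey=int(fields[0]), cat_number=fields[1], accessions=fields[2:])


# Example row taken from the table body.
row = parse_cluster_row("302777 33803.1 AJ230640 AJ230648")
print(row.pkey, row.cat_number, row.accessions)
```

The CAT number is kept as a string in this sketch because cluster numbers in the table carry a suffix after the separator, which a numeric type would not preserve.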



Pkey CAT number Accession 



322050 24275.1 
321439 1599424.1 
321666 13653.22 



300088 622937.1 
322303 704603.1 
322394 27492.1 



321758 44275.1 
323109 155498 J 



322533 38937.1 
321921 34680.1 
321927 21620J 



321932 265316.1 
306971 14694.7 



AL137589 AA423949 BE222949 BE222694 AI1 99615 AW8731 16 AI277950 AW044290 AW630096 
H61962 W01567 N75711
BE312230 BE407843 BE253884 BE407645 BE615804 BE619058 BE559512 BE383249 BE613497 BE294351 BE295062
BE622385 BE390654 BE535438 BE563186 BE396374 BE270842 BE3861 10 BE260368 BE250186 BE265875 BE537229 
BE253369 BE256997 BE269482 BE264959 BE279072 AA662160 BE280733 AAS58428 BE561308 BE267285 BE561422 
BE563181 BE304614 BE295437 BE619424 BE275863 BE394315 BE4O8109 BE541866 BE253772 BE618236 BE535261 
BE296490 BE278212 BE563154 BE257245 BE262274 BE513032 BE378567 BE394152 BE616947 BE269302 BE546516 
BE536792 BE615187 BE261186 BE615367 BE619289 BE261 184 T49376 AL031671 BE273400 BE563457 BE545597 
BE615169 AA150323 AA1 58723 AA079033 BE313333 AA160100 BE271115 BE294302 BE273051 BE273048 BE622390 
AA837947 BE387721 AW973277 AA808731 BE280792 AA160444 BE256723 AI745420 AA643017 BE549441 BE293858 
AW975249 AI620819 AW089494 AI434549 BE305231 AA081262 BE280101 AA522507 AI950880 AA187460 BE386860 
AW859229 8E170439 BE620149 BE548218 AA316696 AA484426 AJ567740 AA 160605 AW939805 AA089573 BE300194 
BE391331 AW975419 H26808 BE545544 BE615974 AW800241 BE616222 W17343 BE387865 T53697 C03943 BE617637 
BE315130 T52942 T50588 N74693 AA187107 T59919 AW797397 AA206447 AA854619 T57175 A1570296 AW517964 
AA158269 AI282220 W25297 AI580710 BE262453 A1185868 AA526485 AI288051 AI582513 AA100675 AW615567 
BE395354 AI472725 BE314881 BE621281 N99921 AI282689 AI432725 AW73201 1 AA872254 BE205807 T59435 AI282712 
AA650505 AI004374 AA725260 BE313161 T60173 AI371260 BE385641 AW751812 AA078827 AI491858 AI433822 
AA2 191 18 AI002092 AAS96003 AA064604 AI250287 AI304397 AI45321 3 AA653630 AI524573 AI440306 H48802 AA1 57843 
AA715629 AW973788 AA932493 A1347563 AA1 81 309 T67880 AA643033 AW467498 AA1 1 5904 AA93541 0 AA483032 
AA084568 W25246 AI567588 AA155732 AA158614 AA888319 AA158568 AA188422 AI309183 AA084817 AA157995 
AI859659 AA188008 AI287379 AI540675 AA085212 AW028391 AA173297 BE256792 AA182854 BE378771 BE538571 
AA079037BE281597 AA643926 W81011 AA1 59344 AA320691 AA877597T57107 AW263819 AI690413 A1619605 A1687579 
AA970560 AI368942 AI927104 AW419220 AI620051 AA128490 AA 120825 AA079520 AA199648 AW188403 BE045224 
AW265533 AA074338 M102685 AW779399 AA192451 AA182771 AW366812 BE281418 AA21 1094 AA131073 AA487924 
AW674848 AI568103 AA171934 F30349 AW088785 AA581370 AA205482 AW352296 AW517565 AI376249 AA1588B4 
A1340509 T59965 AA085193 AA071570 AI874045 AA852755 BE045217 AW189428 AA21 1 141 AA652134 A1497729 
AA994817 AI81 1459 BE535857 AW769697 AW167892 AW149305 AI864981 AW272126 AW023245 AI439266 AI953196 
AA160912 AI718580 BE537547 AA501448 AA069308 L07393 AA353007 AA079235 AI539140 AA740154 W5B341 AA888403 
BE299000 AA196413 BE613327 BE261523 AA866599 AW844713 AI691 159 AI079975 AW327479 BE180731 AA984805 
AW500732AW504061 
AA774672AW504164 
AA769074 AA570769 AA808585 AA808682 
AW505368 AA218610 F11852 T65345 AA397806 
BE297711 AW505574 AA704983 
F07942T08033 

BE386266 BE148823 T23215 AI906290 AA299906 BE2071 97 AW0741 14 AI760368 AI005358 AW662201 AA188988 
AI690711 AA775103 AW072931 AI684269 AW129364 AW615634 AI049941 AW874040 AI352633 AA188989 AI287775 
AA868774AA599660 
AA780365 AA909233 AI275542 
AA210878AA215684R11101 

M13560 AA336951 AA161015 R72814 T69687 R75705 T61319 AA158454 R50579 T55649 AI214156 T70375 R31655 
H64997 AW800487 H491 10 AA834206 H42384 H21783 AI560152 AA664230 H42302 R46708 AA013277 T61901 T92417 
AA875985 T61962 T63055 AA430725 AA458964 AA578746 AI582385 T63000 AI499875 H64998 AA022538 Al 364804 
AI88521 1 AI439714 AI224059 AC49917 T59258 AA477805 AA715834 AA916120 R38304 R35899 R82985 H25524 H82984 
AW516728 T54642 AA079866 H27555 AA455820 T63919 R79450 A1431241 AA937349 AA127213 AA421729 H61 196 
T63894 AA013050 AA079133 W96364 AA487926 AI762796 H26377 AI433386 AJ865423 AW371475 R98189 AA643978 
AI71B204 AW381954 AI862735 
319538 226485 J AA323758 R12731 R14082 

320257 163534 J R17531 AW960899 AA338366 AW673294 BE047729 BE047722 AA330746 AW841797 H05030 A1142105 R12654 

320289 115941J H07989 AJ239462 H24544 AA078369 R74153 

304703 33971 42 BE512926 BE304794 AA129140 AA052922 AA092258 BE378058 BE615391 BE615218 BE616188 AC14126H05675 

W56857 AI028525 BE617241 BE531271 AW856227 T56489 AA322005 AW794148 AF170577 BE615738 AA005138 L76930 
L76932 L76933 X95410 AW389462 BE563092 AW997937 AA263158 AJ520992 AW947350 AA522535 AW945921 AV653776 
AW884835 AW947338 AI687178 AW945799 AI905627 AW948449 AV653751 AW945924 AA563898 AW945810 AW945832 
AW371449 AW945864 AW948447 AW945910 AA643002 AA522680 AA522715 AA578840 AA523279 AA826150 AW945809 
AW405998 AA551909 R23173 AA595545 AW389497 AI933770 Al 125053 A1471803 AW795856 AW796937 W30675 H7031 7 



303701 1155179,1 
303759 447287J 
303773 356632 J 
303778 174437J 
303784 414659J 
303845 5021 1J 
303898 162688_3 



20121 452027J 
319590 171338J 
305186 17456.1
321039 26338_2






306051 19085.3 



321163 171122J 
321235 1102181J 
320603 4297J 



320641 185591J 
320651 58648J 
321325 28266 J 

305704 464759.-1 

322011 23158J 

306407 

306454 

306516 

306518 

306526 

306534 

306590 

306591 

306631 

306654 

306786 

306799 

308023 

308070 

308099 

306805 

306814 

306873 

306911 

306982 

308238 

308258 

3082B9 

308311 

308332 

308511 

308601 

308612 

308636 

308814 

308851 

308981 

310570 1071946.1 
305022 
305060 
305070 



H68296 T59240 AA397650 H59852 AA938072 AA978010 R35643 T89735 AW361585 AW196153 AI538069 AA604540 
A1434259 R49181 T58717 AW062486 AW796966 AJ648384 R77733 A1623502 BE171342 BE171303 R35658 AW974883 
AW149898 AI500045 AI540710 AI540392 AW009172 AW277199 AI371312 AI500096 A1470297 AW372940 AW844562 
AW844560 AW797965 AI691 146 X07062 AW799199 H60666 AA837684 AF130734 T25952 AI933771 AI91 4850 AW39 1925 
AW793843 AW795012 AW366709 AW750987 AW750985 R35765 AW844942 AW750986 H64920 R34651 X86703 
BE018103 BE018083 BE293253 AW247083 BE207643 BE514793 BE183238 AA376427 AW273850 AW043786 BE439973 
AL045428 A1889050 AA026496 AI422924 AI884485 W96068 AA020872 F371 19 AA714378 AA021 107 AA01 1 141 A1554001 
AI375841 AI469097 AA335219 AW967315 AI692177 AA410448 AI568858 AA582647 AA026419 AA281639 AW515248 
AW007777 AA01 0840 AW 188439 A1805423 AI148210 BE301590 AA744414 AA745392 AW167423 AA622659 AW000878 
AI432387 AA760930 BE047189 AA021605 AV658045 AI093347 AA588594 H63143 AA639556 AI308976 AA379270 
AA633407 AI874329 AI206484 AI493895 AI694103 AI249682 AA973765 AA872445 AI125446 AA287272 AW069761 
AA682569 AW009712 BE542774 R50167 BE301574 AA991202 AA502006 AI219819 AW074373 AA617996 AI521242 F25241 
AW615812 R16774 AA335218 AW673800 H26778 A1468557 AI886986 AI560759 AI460075 AA502968 AA503273 AA610680 
AA287274 AA554020 AA284889 AA916636 AW469457 AW273250 AW673708 AW512948 AL041071 AI446042 AA903535 
BE172441 AI28241 1 AW265021 AA810799 A1559865 M729332 AW00461 1 AW129451 AA659019 BE208239 AA610825 
H03511 BE383995 R16474 AA281701 AW009244 AA287424 AA558139 AW364081 

F08147 AW408359 AW949429 R23785 AW247442 AA305512 T29095 AA905130 BE246361 BE244981 AA220199 BE504058 
X80878 AA533727 AA608601 AW005964 AI81 1627 AI367037 AI277985 AI493719 A1277848 AA854982 AW247298 AI216345 
AI041295 AI887378 AA781241 AI674270 AW628959 AI383083 BE504391 AA729421 AA552188 AA373387 AW880360 
AW875262 AW875369 AW581540 AW875358 AW581568 R23735 AW134768 
W03912 AW971410 AA506385 AA209530 H73495 H48629 W56149 
H56752 AW340364 N49521 

AA853680 AK001668 BE386425 BE563549 BE296124 BE298950 R51419 U46295 BE147292 AA360056 R48018 AW845348 
N47383 AI817280 AI671902 AA988104 AA479464 N56996 AI192374 AI927558 AA659888 A1799903 AA548397 AI161167 
AI656333 AI418829 AW592671 BE327906 AW513346 A1888579 AW469410 AW512809 D25682 AA576079 AA479354 
T30342 R51307 T16044 H29063 AW079357 AI339477 R47914 AI986068 AI870065 AI868489 AI521099 AI582732 AA995540 
AW957299 AA352608 AA676752 AA41O510 AA358874 AI865724 AA853679 A1699265 AW188789 N47380 AA233715 
BE258194 R55421 R55643 H42362 AA243884 

AW886407 AA489268 R57015 R58094 BE077459 BE077423 BE546995 AW849216 T69383 AW9381 1 1 H60337 BE221073 
AB033100 AA347036 BE260325 AW961669 AL047207 AA347037 AI766894 AA601045 AI559897 AW139033 AW274622 
AW172884 AW089070 AA804340 AW798925 
AA825266 

AL1 37354 AL043375 

AA971985 

AA977992 

AA989542 

AA989598 

AA989713 

AA991487 

AI00Q246 

A1000248 

AI001149 

AI003654 

AI041589 

AI051696 

AI452732 

AI470948 

AI475914 

AI055966 

AI066577 

AI086929 

AI095365 

AI127B83 

AI559492 

AI565612 

AI571211 

AI581855 

AI591235 

AJ687580 

AI719930 

AI735634 

AI744063 

AJ819263 

AI829820 

AI873242 

AI318327AI318328 A13 18495 

AA627416 

AA635771 

AA639783 




305079 
305134 
303977 
305216 
305263 
305266 
305396 
305403 
305488 
305549 
305601 
305610 
305621 
305710 
305724 
305744 
305752 
307018 
307055 
307058 
305801 
305830 
305836 
305852 
305858 
305866 
305867 
307126 
305903 



AA641329 

AA653159 

AW512978 

AA569056 

AA679467 

AA679772 

AA721052 

AA723748 

AA749000 

AA773530 

AA780975 

AA782319 

AA789095 

AA826544 

AA827608 

AA831819 

AA835278 

AI140639 

AI148477 

AJ148709 

AA845997 

AAB57665 

AA858043 

AA862455 

AA863103 

AA864533 

AA864572 

AI184951 

AA873085 



328803 c_7_hs 
328809 c_7_hs 
305949 AA884409 
328829 c_7_hs 
330021 c16 _p2 
330024 Cl6_p2 
330028 c16_p2 
330049 c17j)2 
305993 AA889197 

330095 c19_p2 

330096 c19jj2 

307205 AI192479 
307427 AI243437 
307491 AI268539 
307581 A1284415 
307588 AI285535 
337672 C«2_6002FG_UNK_EMAC00 
337693 CH22^6030F6_L1NK_EM^COO 
337738 CH22_6083FG_UNK_EM^C00 
307692 AJ318342 
307806 AI351739 
309107 A1925823 
309230 AJ970747 
339338 CH22.8300FG_UNK_BA354I1 
309257 A1984183 
309366 AW072970 
309422 AW087175 
325207 c10_hs 
325257 c11_hs 

309646 AW194694 
309651 AW195850 
325313 c11_hs 

309924 AW340812 
334030 CH22.1308FG_320_2_UNK^EM 
334040 CH22.1 31 8FG_322_8_UNK_EM 
334083 CH22 J 36 1 FGJE7_38JJNK_E 
332810 CH22_26FG_7_12_UNKJS5E1 
302747 32813J AF062275 L03830 

302753 33029.1 M74299 M743Q2 M74303 

302777 33803.1 AJ230640 AJ230648
304094
302824 
302996 
325870 
304240 
304410 
304443 
304475 
304522 
304678 
304705 
306004 
306008 
306013 
306082 
336174 
306094 
304823 
304872 
304918 
304955 
306249 
306286 
306295 
306317 
306347 
306365 
306398 
330401 
330463 



330535 
332634 



35372.1 
41196J 
c16_hs 



H11295 

U21260 U21256 
AF054663AF124197A7G292 

AA009802 
AA284508 
AA399444 
AA428879 
AA465405 
AA548556 
AA564064 
AA889992 
AA894390 



AA908508 
CH22_3567FG_71 0_1_LINK_DA 
AA908877 



entrezj)28383 
460.2 



1374_-8 
10404_2 



AA584837 
AA595289 
AA602697 
AA613504 
AA933840 
AA936892 
AA937331 
AA947909 
AA961144 
AA962086 
AA970548 
D28383 

NM_001055 AA332948 U26309 U09031 L19955 L10819 A1366043 X84654 U71086 AV654451 AJ007418 AA053625 

BE1 68856 AA376730 H12694 AA810348 AA621972 AI818950 AV645367 A1819966 AA910602 AW512449 H67893 AI310497 

AI304330 AI339217 AW193588 AW438688 AI818970 AW316799 AA906527 AA777570 N47673 AI336428 AW945133 

AI038606 R29692 AW194197 AI304748 H12639 AA053178 AA493213 AA676958 AA1 13154 AI313469 AI368239 R93183 

W24532 U52852 U54701 AL046864 AA365795 

U11872 

U24488 NM_007116

TABLE 13B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 13. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
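
To make the column definitions above concrete, here is a minimal Python sketch that parses one Table 13B row and derives the length of the predicted exon. It assumes each row is laid out on one line as Pkey, Ref, Strand, Nt_position, and that the two positions are inclusive coordinates; both are assumptions made for illustration. The names `ExonRow`, `parse_exon_row` and `exon_length` are hypothetical.

```python
from typing import NamedTuple


class ExonRow(NamedTuple):
    """One Table 13B row (hypothetical record type for illustration)."""
    pkey: int    # Eos probeset identifier
    ref: str     # sequence source (a citation or a Genbank GI number)
    strand: str  # "Plus" or "Minus"
    start: int   # first nucleotide position as listed
    end: int     # second nucleotide position as listed


def parse_exon_row(line: str) -> ExonRow:
    """Parse 'Pkey Ref Strand start-end'; Ref may itself contain spaces."""
    pkey, rest = line.split(maxsplit=1)
    ref, strand, position = rest.rsplit(maxsplit=2)
    start, end = (int(part) for part in position.split("-"))
    return ExonRow(int(pkey), ref, strand, start, end)


def exon_length(row: ExonRow) -> int:
    """Length in nucleotides, assuming inclusive coordinates."""
    return abs(row.end - row.start) + 1


row = parse_exon_row("332791 Dunham, I. et al. Plus 72720-73315")
print(row.strand, exon_length(row))  # prints "Plus 596"
```

The absolute difference is used so that the length is the same whether positions are listed in ascending or descending order.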



Pkey Ref Strand Nt_position

332791 Dunham, I. et al. Plus 72720-73315
332792 Dunham, I. et al. Plus 73381-73768
332810 Dunham, I. et al. Plus 304295-304384
332944 Dunham, I. et al. Plus 2414825-2414932
332972 Dunham, I. et al. Plus 2572152-2572236
333133 Dunham, I. et al. Plus 3360058-3360195
333154 Dunham, I. et al. Plus 3615887-3616019
333155 Dunham, I. et al. Plus 3616832-3617003
333227 Dunham, I. et al. Plus 3992866-3992968
333230 Dunham, I. et al. Plus 3995507-3996507
333298 Dunham, I. et al. Plus 4581537-4581947
333304 Dunham, I. et al. Plus 4629943-4630242
333305 Dunham, I. et al. Plus 4630388-4630645
333365 Dunham, I. et al. Plus 4786883-4787283
333383 Dunham, I. et al. Plus 4907179-4907277
333391 Dunham, I. et al. Plus 4916697-4916780
333392 Dunham, I. et al. Plus 4918294-4918433
333397 Dunham, I. et al. Plus 4922466-4922635
333403 Dunham, I. et al. Plus 4925140-4925256
333413 Dunham, I. et al. Plus 4943824-4943974
333445 Dunham, I. et al. Plus 5097827-5097885
333479 Dunham, I. et al. Plus 5272855-5272939
333481 Dunham, I. et al. Plus 5286358-5286505
333483 Dunham, I. et al. Plus 5297945-5298105
333516 Dunham, I. et al. Plus 5570204-5570390
333517 Dunham, I. et al. Plus 5570729-5570925
333518 Dunham, I. et al. Plus 5571761-5572025
333531 Dunham, I. et al. Plus 5622622-5622684
333566 Dunham, I. et al. Plus 5954226-5954473
333572 Dunham, I. et al. Plus 6026896-6027189
333586 Dunham, I. et al. Plus 6246834-6247314
333588 Dunham, I. et al. Plus 6255445-6255779
333594 Dunham, I. et al. Plus 6308990-6309450
333595 Dunham, I. et al. Plus 6323103-6323348
333600 Dunham, I. et al. Plus 6355629-6355925
333601 Dunham, I. et al. Plus 6360075-6360442
333607 Dunham, I. et al. Plus 6504431-6504690
333612 Dunham, I. et al. Plus 654956^6549697
333613 Dunham, I. et al. Plus 6550643-6550748
333614 Dunham, I. et al. Plus 6551227-6551389
333624 Dunham, I. et al. Plus 6595146-6595244
333626 Dunham, I. et al. Plus 6614174-6614467
333635 Dunham, I. et al. Plus 6663633-6663973
333637 Dunham, I. et al. Plus 6674968-6675134
333642 Dunham, I. et al. Plus 6708760-6709139
333647 Dunham, I. et al. Plus 6772502-6772779
333653 Dunham, I. et al. Plus 6811130-6811392
333654 Dunham, I. et al. Plus 6816731-6816993
333656 Dunham, I. et al. Plus 6822087-6822406
333657 Dunham, I. et al. Plus 6831369-6831445
333658 Dunham, I. et al. Plus 6835282-6835474
333659 Dunham, I. et al. Plus
333684 Dunham, L etal Plus 
333686 Dunham, I. etal Phis 

333697 Dunham, I. etaL Plus 

333698 Dunham, LetaL Plus 

333699 Dunham, LetaL Plus 
333703 Dunham, I. etaL Plus 
333709 Dunham, I. etaL Pius 
333747 Dunham, I. etaL Plus 

333774 Dunham, I. etal. Plus 

333775 Dunham, I. etaL Pius 
333806 Dunham, LetaL Plus 
333843 Dunham, I. etaL Plus 
333854 Dunham, LetaL Plus 
333873 Dunham, I. etaL Plus 
333880 Dunham, LetaL Plus 
333885 Dunham,!. etaL Plus 
333918 Dunham, I. etal Plus 
333947 Dunham, LetaL Pius 
333961 Dunham, LetaL Pius 
333981 Dunham, LetaL Plus 
333991 Dunham, I. etal Plus 
333994 Dunham, I. etal Plus 
334030 Dunham, I. etaL Plus 
334083 Dunham, I. etal Plus 
334111 Dunham, I. etal Plus 
334135 Dunham, I. etal Plus 
334218 Dunham, 1 etal Plus 
334249 Dunham, LetaL Plus 
334262 Dunham, LetaL Plus 
334264 Dunham, LetaL Plus 

334327 Dunham,! etal. Plus 

334328 Dunham, LetaL Plus 
334340 Dunham, LetaL Plus 
334454 Dunham, LetaL Plus 
334504 Dunham, I. etal. Pius 
334508 Dunham,!. etal. Plus 
334512 Dunham, LetaL Plus 
334582 Dunham, LetaL Plus 
334659 Dunham, LetaL Plus 
334721 Dunham, LetaL Plus 
334723 Dunham, LetaL Plus 
334730 Dunham, LetaL Pius 
334774 Dunham, LetaL Plus 
334778 Dunham, L eta!. Plus 
334851 Dunham, I. etal. Plus 
334885 Dunham, LetaL Plus 
334902 Dunham, LetaL Plus 

334905 Dunham, LetaL Plus 

334906 Dunham, LetaL Plus 
334910 Dunham, LetaL Plus 
335018 Dunham,! etal Pius 
335025 Dunham, I. etal. Plus 
335033 Dunham, LetaL Plus 
335044 Dunham, LetaL Plus 
335142 Dunham,!. etal Plus 
335157 Dunham, L etal Plus 
335160 Dunham, LetaL Plus 
335174 Dunham, LetaL Plus 
335188 Dunham, I.eLal Plus 

335190 Dunham, LetaL Plus 

335191 Dunham,! etal. Plus 
335193 Dunham, I. etaL Plus 
335204 Dunham, LetaL Plus 
335222 Dunham, LetaL Plus 

335226 Dunham, LetaL Plus 

335227 Dunham,!. etal Plus 

335309 Dunham, L etal Phis 

335310 Dunham, 1 etal. Plus 



6836179-6836248 

7169561-7169742 

7177117-7177302 

7203859-7203934 

7205279-7205383 

7206101-7206175 

7215559-7215663 

7229730-7229835 

7605884-7606206 

7716509-7716636 

7729983-7730149 

7877475-7877666 

7978762-7978887 

8029446-8029524 

8133266-8133429 

8151923-8152133 

6154352-8154437 

8307124-8307215 

8579888-8579966 

8617999-8618104 

8782374-6782643 

8837419-8837551 

8852749-8852894 

9288463-9288782 

9837016-9837081 

10279365-10279531 

10457085-10457183 

12680289-12680378 

13190430-13190574 

13231452-13231581 

13234447-13234544 

13577413-13577496 

13589868-13589936 

13642407-13642522 

14326506-14326738 

14510206-14510398 

14514936-14515122 

14545933-14546366 

15026255-15026371 

15460624-15460726 

15796816-15796987 

15805317-15805399 

15967830-15967934 

16251857-16252178 

16276180-16276395 

17820110-17820810 

19233667-19233787 

19317083-19317195 

19322553-19322680 

19323493-19323590 

19398155-19398684 

20688268-20688415 

20743941-20744050 

20753188-20753314 

20842088-20842682 

21465105-21465186 

21543302-21544341 

21573388-21573497 

21631301-21631447 

21669118-21669328 

21680807-21680876 

21681110-21681183 

21692208-21692362 

21750636-21750726 

21885542-21885608 

21890838-21890930 

21892145-21892289 

22500158-22500276 

22500714-22500831 






335311 Dunham, I. et al. Plus
335355 Dunham, I. et al. Plus
335362 Dunham, I. et al. Plus
335368 Dunham, I. et al. Plus
335384 Dunham, I. et al. Plus
335385 Dunham, I. et al. Plus
335436 Dunham, I. et al. Plus
335440 Dunham, I. et al. Plus
335441 Dunham, I. et al. Plus
335450 Dunham, I. et al. Plus
335453 Dunham, I. et al. Plus
335458 Dunham, I. et al. Plus
335464 Dunham, I. et al. Plus
335496 Dunham, I. et al. Plus
335497 Dunham, I. et al. Plus
335498 Dunham, I. et al. Plus
335499 Dunham, I. et al. Plus
335500 Dunham, I. et al. Plus
335651 Dunham, I. et al. Plus
335663 Dunham, I. et al. Plus
335715 Dunham, I. et al. Plus
335719 Dunham, I. et al. Plus
335819 Dunham, I. et al. Plus
333327 Dunham, I. et al. Minus
333335 Dunham, I. et al. Minus
333454 Dunham, I. et al. Minus
333780 Dunham, I. et al. Minus
334306 Dunham, I. et al. Minus
334363 Dunham, I. et al. Minus
334409 Dunham, I. et al. Minus
334414 Dunham, I. et al. Minus
334498 Dunham, I. et al. Minus
334501 Dunham, I. et al. Minus
334502 Dunham, I. et al. Minus
334543 Dunham, I. et al. Minus
334622 Dunham, I. et al. Minus
334650 Dunham, I. et al. Minus
334680 Dunham, I. et al. Minus
334745 Dunham, I. et al. Minus
334756 Dunham, I. et al. Minus
334758 Dunham, I. et al. Minus
334761 Dunham, I. et al. Minus
334763 Dunham, I. et al. Minus
334784 Dunham, I. et al. Minus
334790 Dunham, I. et al. Minus
334793 Dunham, I. et al. Minus
334802 Dunham, I. et al. Minus
334820 Dunham, I. et al. Minus
334824 Dunham, I. et al. Minus
334832 Dunham, I. et al. Minus
334842 Dunham, I. et al. Minus
334844 Dunham, I. et al. Minus
334857 Dunham, I. et al. Minus
334927 Dunham, I. et al. Minus

336274 Dunham, I. et al. Minus 32085468-32085303

336318 Dunham, I. et al. Minus 33364452-33364338
336326 Dunham, I. et al. Minus 33567328-33567201
336339 Dunham, I. et al. Minus 33798479-33798330
336340 Dunham, I. et al. Minus 33812069-33811915
336355 Dunham, I. et al. Minus 33874750-33874649
336392 Dunham, I. et al. Minus 34015868-34015736
336393 Dunham, I. et al. Minus 34016145-34015951
336394 Dunham, I. et al. Minus 34016457-34016298
336400 Dunham, I. et al. Minus 34023437-34023298
336402 Dunham, I. et al. Minus 34024090-34023981
336413 Dunham, I. et al. Minus 34046702-34046576
336424 Dunham, I. et al. Minus 34055549-34055491
336425 Dunham, I. et al. Minus 34058544-34058446
336437 Dunham, I. et al. Minus 34074154-34074090
336447 Dunham, I. et al. Minus 34198207-34197996
336449 Dunham, I. et al. Minus 34204707-34204577
336466 Dunham, I. et al. Minus 34213195-34213046
336492 Dunham, I. et al. Minus 34255578-34255437
336511 Dunham, I. et al. Minus 34277480-34277351
336512 Dunham, I. et al. Minus 34278373-34278275
336520 Dunham, I. et al. Minus 34319184-34319101
336522 Dunham, I. et al. Minus 34320169-34320056
336524 Dunham, I. et al. Minus 34321055-34320921
336527 Dunham, I. et al. Minus 34322071-34321966
336534 Dunham, I. et al. Minus 34326797-34326620
336536 Dunham, I. et al. Minus 34327678-34327538
336542 Dunham, I. et al. Minus 34331316-34331183
336556 Dunham, I. et al. Minus 34375244-34374907
336557 Dunham, I. et al. Minus 34375443-34375341
336558 Dunham, I. et al. Minus 34375825-34375698
336559 Dunham, I. et al. Minus 34376430-34376261
336560 Dunham, I. et al. Minus 34376814-34376596
336561 Dunham, I. et al. Minus 34377168-34376928
336597 Dunham, I. et al. Minus 7627912-7627757
336601 Dunham, I. et al. Minus 13265853-13265654
336642 Dunham, I. et al. Minus 1304281-1304212
336645 Dunham, I. et al. Minus 1351268-1351168
336662 Dunham, I. et al. Minus 2158060-2157993
336664 Dunham, I. et al. Minus 1993558-1993481
336676 Dunham, I. et al. Minus 2022565-2022497
336684 Dunham, I. et al. Minus 2158060-2157993
336686 Dunham, I. et al. Minus 2160698-2160486
336714 Dunham, I. et al. Minus 3094026-3093871
336719 Dunham, I. et al. Minus 3331631-3331503
336736 Dunham, I. et al. Minus 4093128-4093041
336744 Dunham, I. et al. Minus 4333001-4332848
336786 Dunham, I. et al. Minus 5419973-5419873
336793 Dunham, I. et al. Minus 5631345-5631237
336859 Dunham, I. et al. Minus 8201756-8201561
336863 Dunham, I. et al. Minus 8396673-8396425
336933 Dunham, I. et al. Minus 11760045-11759981
336942 Dunham, I. et al. Minus 12027537-12027455
336960 Dunham, I. et al. Minus 13267243-13267172
336969 Dunham, I. et al. Minus 13725722-13725643
336971 Dunham, I. et al. Minus 13732308-13732221
337003 Dunham, I. et al. Minus 15523541-15523422
337011 Dunham, I. et al. Minus 16106423-16106080
337070 Dunham, I. et al. Minus 19034423-19034321
337072 Dunham, I. et al. Minus 19077452-19077323
337086 Dunham, I. et al. Minus 19657011-19656881
337140 Dunham, I. et al. Minus 22649450-22649388
337193 Dunham, I. et al. Minus 24594969-24594874
337256 Dunham, I. et al. Minus 27659956-27659876
337278 Dunham, I. et al. Minus 28429017-28428848
337284 Dunham, I. et al. Minus 28491414-28491094
337293 Dunham, I. et al. Minus 28846334-28845873
337316 Dunham, I. et al. Minus 29657129-29656997
337326 Dunham, I. et al. Minus 30017199-30017069
337382 Dunham, I. et al. Minus 31233666-31233579

337392 Dunham, I. et al. Minus 31442311-31442229
337406 Dunham, I. et al. Minus 31864840-31664588
337412 Dunham, I. et al. Minus 31916487-31916312
337419 Dunham, I. et al. Minus 32021496-32021170
337438 Dunham, I. et al. Minus 32257869-32257739
337455 Dunham, I. et al. Minus 32434517-32434425
337509 Dunham, I. et al. Minus 33414613-33414498
337518 Dunham, I. et al. Minus 33796750-33796647
337529 Dunham, I. et al. Minus 34043668-34043546
337533 Dunham, I. et al. Minus 34193388-34193261
337539 Dunham, I. et al. Minus 34254490-34254322
337551 Dunham, I. et al. Minus 34524446-34524362
337553 Dunham, I. et al. Minus 24230-24160
337591 Dunham, I. et al. Minus 1006414-1006184
337592 Dunham, I. et al. Minus 1007791-1007634
337593 Dunham, I. et al. Minus 1009460-1009291
337607 Dunham, I. et al. Minus 1355719-1355637
337612 Dunham, I. et al. Minus 1570235-1570142
337635 Dunham, I. et al. Minus 2169690-2169569
337824 Dunham, I. et al. Minus 4559540-4559266
337825 Dunham, I. et al. Minus 4567155-4567005
337850 Dunham, I. et al. Minus 5077143-5076943
337854 Dunham, I. et al. Minus 5153435-5153272
337913 Dunham, I. et al. Minus 6149843-6149786
337915 Dunham, I. et al. Minus 5922748-5922690
337968 Dunham, I. et al. Minus 7095797-7095680
338010 Dunham, I. et al. Minus 7754282-7754184
338012 Dunham, I. et al. Minus 7761421-7761351
338017 Dunham, I. et al. Minus 7864521-7864401
338065 Dunham, I. et al. Minus 7235048-7234950
338094 Dunham, I. et al. Minus 9595602-9595440
338129 Dunham, I. et al. Minus 10915338-10915237
338132 Dunham, I. et al. Minus 10989617-10989530
338150 Dunham, I. et al. Minus 11478551-11478355
338157 Dunham, I. et al. Minus 11731444-11731375
338195 Dunham, I. et al. Minus 13484103-13483972
338255 Dunham, I. et al. Minus 15242294-15242231
338276 Dunham, I. et al. Minus 16109555-16109398
338431 Dunham, I. et al. Minus 19747608-19747496
338448 Dunham, I. et al. Minus 20151152-20151054
338451 Dunham, I. et al. Minus 20174286-20174193
338477 Dunham, I. et al. Minus 20821897-20821838
338534 Dunham, I. et al. Minus 21771238-21771170
338682 Dunham, I. et al. Minus 24800712-24800461
338684 Dunham, I. et al. Minus 24827522-24827428
338689 Dunham, I. et al. Minus 24893073-24892972
338695 Dunham, I. et al. Minus 25104153-25104016
338825 Dunham, I. et al. Minus 27664798-27664712
338842 Dunham, I. et al. Minus 27824238-27824079
338893 Dunham, I. et al. Minus 28491807-28491631
338904 Dunham, I. et al. Minus 28766345-28766253
338935 Dunham, I. et al. Minus 29071537-29071461
339022 Dunham, I. et al. Minus 30523414-30523289
339034 Dunham, I. et al. Minus 30621603-30621422
339190 Dunham, I. et al. Minus 32403103-32402985
339212 Dunham, I. et al. Minus 32494335-32494210
339213 Dunham, I. et al. Minus 32496590-32496440
339216 Dunham, I. et al. Minus 32504250-32504109
339233 Dunham, I. et al. Minus 32751331-32751238
339258 Dunham, I. et al. Minus 32934756-32934615
339262 Dunham, I. et al. Minus 32971258-32971090
339263 Dunham, I. et al. Minus 32974634-32974452
339265 Dunham, I. et al. Minus 32975943-32975806
339338 Dunham, I. et al. Minus 33468728-33468606
339396 Dunham, I. et al. Minus 34017306-34017205
339400 Dunham, I. et al. Minus 34045024-34044940
339425 Dunham, I. et al. Minus 34407911-34407798
325207 6552430 Plus 140049-140170

329568 3962490
329517 3983513
325313 5866865
325327 5866875
325317 5866878
325257 5866895
329632 6729060
325371 5866920
325375 5866920
325378 5866920
325469 6017034
325470 6017034
325576 6552443
325505 6682451
325543 6682452
329635 5302817
329636 5302817
325593 5866992
325675 5867014
325704 5867028
325682 6138923
325785 6381957
325666 6469822
325818 6682490
329777 6002090
329768 6015501
329759 6048280
329731 6065783
329687 6117856
329676 6272128
329667 6272129
329669 6272129
329670 6272129
329641 6468233
329791 6469354
325826 5867048
325829 5867052
329888 6067149
329893 6525313
329899 6563505
325988 5867064
325855 5867067
325999 5867073
326001 5867073
325886 5867087
325882 5867087
325905 5867104
325922 5867122
325937 5867132

329916 6223624
330021 6671889
330024 6671908
330028 6671908
326033 5867178
326036 5867178
326056 5867184
326116 5867193
326122 5867194
326138 5867203



Plus 


36331-36750 


Minus 


53197-53269 


Minus 


27385-28192 


Plus 


75189-75264 


Minus 


156551-156649


Plus 


10867-10955 


Plus 


192813-193017 


Minus 


1035422-1035536 


Minus 


1165503-1165810 


Minus 


1187981-1188167


Plus 


286823-286991 


Plus 


287578-287663 


Minus 


137769-137894


Minus 


240852-240946 


Plus 


151873-152057 


Minus 


62522-62622 


Minus 


64969-65078 


Minus 


469726-469860 


Plus 


955517-955711 


Plus 


156198-156387 


Plus 


370618-370763 


Plus 


61849-62003 


Plus 


16769-16857 


Minus 


120278-120559 


Minus 


191389-191479 


Plus 


118315-118422 


Minus


37647-37730



Plus 


158772-158900 


Minus 


22165-22288 


Minus 


142207-142359 


Plus 


101355-101745 


Plus


131223-131291 


Plus 


131351-131495 


Minus 


105995-106107 


Minus 


131982-132089 




46361-46458 


Plus 


232674-233060


Minus


37227-37473


Minus


166123-166791 


Minus 


111058-111783 



Plus 


17349-17606 


Plus 


276141-276251 


Plus 


149115-149192 


Plus 


155223-155348 


Plus 


194694-194915 


Minus 


8178-8347 


Plus 


7877Q-78fl76 


mm Uo 




Minus 


15?633-1 52*502 


Minus 


162506-162635 


Minus 


iR5infi-ifi*i?nQ 

1 DO luv* lUhU.ua 


Plus


171451-17153? 


Plus 


181964-182037 




1AA3ft0.1AA5A7 


Minus 


14188-14332 


Plus 


228209-228297 


Minus 


139780-139890 


Minus 


62584-62691 


Minus 


69059-69127 


Plus 


36396-37195 


Plus 


120938-121032 


Minus 


1005-1270 


Minus 


30015-30144 


Plus 


37261-37333 


Minus 


120215-120273 


Minus 


181553-181690 


Plus 


45548-45604 


Plus 


144397-144683 


Minus 


179374-179436 






326145 5867204
326180 5867211
326201 5867216
326207 5867222
326226 5867230
326233 5867232
326238 5867260
326241 5867260
326243 5867261
326251 5867263
326268 5867267
326124 5916395
326339 6056311
330049 4567182
326358 5867293
326365 5867297
326379 5867327
326382 5867327
326390 5867340
326424 5867369
326453 5867399
326472 5867404
326492 5867422
326533 5867441
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Iwl 1 W ■ \J 1 utw 


Minus


9817-9885 


Plus


28227-28413


Plus


117442-118283



Minus 


119761-119911 


Minus 


26413-26820 


Minus 


27522-27614 


Minus 


19855-19962 


Minus 


32819-32939 


Plus


1BQ7 1-19030 

1 05? ill JAAAJ 






Minus


13108-13225



Plus 


133238-133339 


Minus 


222629-222709 


Plus 


392666-392746 


Plus 


52356-52694 


Minus 


116524-116662 


Plus 


290842-290905 


Plus 


614823-615209 


Plus 


721390-721470 


Plus 


144569-144712 


Minus 


38950-39301 


Minus 


68948-69041 


Plus 


362196-362344 


Plus 


84776-84899 


Plus 


97697-97771

TABLE 14 shows genes, including expressed sequence tags, down-regulated in prostate
tumor tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02
GeneChip array. Shown are the ratios of "average" normal prostate to "average" prostate
cancer tissues.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
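
The R1 column can be read as a ratio of two background-subtracted averages. The following Python sketch shows one way such a ratio could be computed. It is an assumption-laden illustration only: the text does not state how the "average" intensities or the background were obtained, and the function name `r1_ratio`, its parameters, and the sample intensities are all hypothetical.

```python
from statistics import mean


def r1_ratio(normal, tumor, background):
    """Ratio of average normal to average tumor signal after background subtraction.

    Assumptions (not stated in the source): a single scalar background is
    subtracted from every intensity, and the averages are arithmetic means.
    """
    normal_avg = mean(value - background for value in normal)
    tumor_avg = mean(value - background for value in tumor)
    if tumor_avg <= 0:
        # A non-positive denominator would make the ratio meaningless.
        raise ValueError("tumor average must be positive after background subtraction")
    return round(normal_avg / tumor_avg, 2)


if __name__ == "__main__":
    # Made-up intensities, chosen only to make the arithmetic easy to follow.
    print(r1_ratio(normal=[210.0, 190.0], tumor=[30.0, 30.0], background=10.0))
```

Under these assumptions, a value above 1 indicates lower expression in tumor than in normal tissue, which matches the table's description of down-regulated genes.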



Pkey ExAccn UnigeneID Unigene Title



331328 
320875 
300994 
323461 
301015 
319419 
323486 



AA281133 

D60641 

AI251936 

AA418762 

AA947682 

AA543096 

C05278 



324882 AW419080 
330569 U57796 
330126 

316265 AA737400 
323045 AA148950 



330769 
312614 
314790 
309979 
314236 
329192 
324307 
303685 
314921 
315840 
332776 
313533 
303494 
317490 
332546 
334719 
300679 
311811 
315310 
312871 
324715 
313870 
321453 
316160 
313833 
315850 
303124 
323346 
301383 
324513 
303480 
323591 
313603 
317853 
312381 
317514 
319750 



AA465192 

AI766732 

AW341754 

AW452118 

AA743396 

AA627642 

AW500106 

AW452382 

AA679001 

AA034364 

AW298141 

F30712 

AI627358 

D84454 

AA813958 

AI625304 

AW511298 

H86747 

AI739168 

AW206435 

N50080 

AW197887 

AA766825 

AW270550 

AF161350 

AL134932 

AA913591 

AW501678 

AA331906 

AA301270 

AW468119 

AT733395 

R42049 

AW451570 

AA621606 



Hs.131921 

Hs.146298 

Hs.190044 

Hs.217173 

Hs.13648

Hs.166800 

Hs.250645 
Hs.57679 

Hs.142230 

Hs.188836 

Hs.146217 

Hs.16514 

Hs.201194

Hs.189305 

Hs.257533 

Hs.189023 

Hs.4994 

Hs.257564 
Hs.192221 
Hs.256551 
Hs.157975 

Hs.148367 
HSJ21899 

Hs.207727 
Hs.190312 
Hs.256067
Hs£27602

Hs.146057
Hs.117827
Hs.253353

Hs.116957

Hs.143607 
Hs.126480 
Hs.164577 



Hs.129124 
Hs.195473 
Hs.126850 
Hs.117956



ESTs 
ESTs 
ESTs 
ESTs 

ESTs; Weakly similar to Chain A; Cdc42hs-Gdp Complex [H.sapiens] 

ESTs; Highly similar to mitogen-induced [M.musculus]

ESTs; Moderately similar to [PYRUVATE DEHYDROGENASE (LIPOAMIDE)] KINASE ISOZYME 4 PRECURSOR [H.sapiens]

ESTs

zinc finger protein 192

CH.21_p2 gi|6093735

ESTs

ESTs

ESTs

ESTs

ESTs

ESTs

EST

ESTs

CH.X_hs gi|5868716

transducer of ERBB2; 2 (TOB2)

EST cluster (not in UniGene) with exon hit

ESTs

ESTs

ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens]
ESTs

EST cluster (not in UniGene) with exon hit
ESTs

solute carrier family 35 (UDP-galactose transporter); member 2
CH22_FGENES.421_30

ESTs; Moderately similar to KIAA0071 [H.sapiens]

ESTs

ESTs

KIAA1116 protein

EST cluster (not in UniGene)

ESTs

ESTs

ESTs

EST cluster (not in UniGene)
ESTs

EST cluster (not in UniGene) with exon hit

ESTs

ESTs

ESTs

EST cluster (not in UniGene) with exon hit

EST cluster (not in UniGene)

EST cluster (not in UniGene)

ESTs 

ESTs 

ESTs 

ESTs 



R1 

18.53
14.35
12.17
10.55
10.17
9.2 

8.87 
8 

7.88 

7.8 

7.7 

7.64 

7.4 

7.15 

7 

6.83 

6.74 

6.49 

6.1 

5.99 

5.82 

5.8 

5.68 

5.43 

5.4 

5.35 

5.31 

5.25

5.25

5.22

5.22

5.19 

5.11 

4.97 

4.97 

4.78 

4.63 

4.58 

4.53

4.46 

4.4 

4.35 

4.28

4.25

4.22

4.2

4.1 

4.08 

4.03 

4.03

322520 T55958 EST cluster (not in UniGene) 4
314754 AW026761 Hs.134374 ESTs 4
316088 AI990652 Hs508973 ESTs 4
318473 AI939339 Hs.146883 ESTs 3.96
307848 AI364186 EST singleton (not in UniGene) with exon hit 3.95
300730 AW449204 Hs557125 ESTs 3.94
303034 W60843 Hs.31570 ESTs 3.93
324668 AI679131 Hs501424 ESTs 3.9
324674 AA541323 Hs.115831 ESTs 3.88
300547 N53442 Hs.143443 ESTs 3.83
316100 AW203986 H$5130Q3 ESTs 3.79
314801 AA481027 Hs.127336 ESTs; Weakly similar to ORF YGR245C [S.cerevisiae] 3.75
32085$ D59945 EST cluster (not in UniGene) 3.74
313188 AI039702 Hs.179573 collagen; type I; alpha 2 3.73
314187 AA804409 Hs.118920 ESTs 3.73
311826 AA765470 Hs.122826 ESTs 3.7
302358 D81150 EST cluster (not in UniGene) with exon hit 3.68
311441 Z38720 Hs.151014 ESTs 3.66
321914 AA011603 EST cluster (not in UniGene) 3.59
332216 H95082 Hs.102332 EST 3.52
324771 AA631739 EST cluster (not in UniGene) 3.5
323691 AA317561 EST cluster (not in UniGene) 3.49
303525 AW516519 Hs.115130 ESTs 3.47
309709 AW242630 EST singleton (not in UniGene) with exon hit 3.46
300038 AFFX control: MurIL4 3.38
316526 AI088192 Hs.135474 ESTs; Weakly similar to ATP-DEPENDENT RNA HELICASE A [H.sapiens] 3.36
313029 AA731520 Hs.170504 ESTs 3.35
304356 AA196027 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 3.34
314610 AI948688 Hs.191805 ESTs 3.33
329815 CH.14_p2 gi|6624888 3.32
314949 AI745387 Hs539124 ESTs 3.31
300598 N53574 Hs.158932 ESTs 3.3
329218 CH.X_hs gi|5868726 3.28
315706 AW440742 Hs.155556 ESTs 3.28
303751 AW503637 EST cluster (not in UniGene) with exon hit 3.25
307783 AI347274 EST singleton (not in UniGene) with exon hit 3.25
321414 AA324975 Hs.128993 ESTs; Weakly similar to KIAA0465 protein [H.sapiens] 3.25
312187 AA700439 Hs.168490 ESTs 3.25
334061 CH22_FGENES.327J4 3.23
336036 CH22_FGENES.678_7 3.23
321477 H67818 Hs522Q59 ESTs 3.21
315760 AW139383 Hs545437 ESTs 3.2
316733 AA811713 Hs.163222 ESTs 3.2
300855 AW235248 Hs.79828 ESTs 3.2
323611 AA304986 Hs.145704 ESTs 3.19
314138 AA740616 EST cluster (not in UniGene) 3.17
316774 AA814859 EST cluster (not in UniGene) 3.16
308884 AI833131 Hs.179100 ESTs 3.11
331317 AA258222 Hs.87757 ESTs 3.1
317221 AI989538 Hs.191074 ESTs 3.08
316386 AA749062 Hs.180285 ESTs 3.08
321040 H26953 EST cluster (not in UniGene) 3.08
308828 AI824829 EST singleton (not in UniGene) with exon hit 3.08
300778 AA236233 Hs.188716 ESTs 3.07
316667 AW015940 Hs532234 ESTs 3.07
324614 AW503101 EST cluster (not in UniGene) 3.07
316468 AW293046 Hs555158 ESTs 3.07
300671 AI239706 Hs.189886 ESTs 3.06
314301 AW297967 Hs.188181 ESTs 3.05
312335 AW043620 Hs536993 ESTs 3.03
322957 AA247755 EST cluster (not in UniGene) 3.01
316848 AA830053 Hs.12$798 ESTs 3.01
313473 AA009660 Hs551948 ESTs; Moderately similar to T07D3.7 [C.elegans] 2.99
318518 T27119 EST cluster (not in UniGene) 2.98
313383 AI076370 Hs.134037 ESTs 2.97
331389 AA458637 Hs.152207 ESTs 2.96
304257 AA053294 EST singleton (not in UniGene) with exon hit 2.95
309917 AW340014 EST singleton (not in UniGene) with exon hit 2.95
319661 K08035 Hs51398 ESTs; Moderately similar to PUTATIVE GLUCOSAMINE-6-PHOSPHATE ISOMERASE [H.sapiens] 2.95

321253 AI699484 EST cluster (not in UniGene) 2.93
321193 AA149508 Hs.103288 ESTs 2.93
332864 CH22_FGENES.28_4 2.92
300027 M11507 AFFX control: transferrin receptor 2.91
324330 AA884766 EST cluster (not in UniGene) 2.88
320014 AA137114 Hs.170291 ESTs 2.88
333916 CH22_FGENESJ296_5 2.88
318885 Z43272 EST cluster (not in UniGene) 2.87
318146 AI040125 Hs.150521 ESTs 2.87
323348 AA233056 Hs.191518 ESTs 2.85
305703 AA825148 Hs.21229 F-box protein Fbw1b 2.84
335862 CH22_FGENES.629_7 2.83
317672 AW205409 Hs.127748 ESTs 2.82
323416 AI610397 Hs.159560 ESTs 2.81
312652 AI419909 Hs.160994 ESTs 2.81
324094 AA382603 EST cluster (not in UniGene) 2.81
319761 R84237 EST cluster (not in UniGene) 2.8
317013 AA864468 Hs.135646 ESTs 2.8
317383 AA913887 Hs.126511 ESTs 2.78
314659 AW277121 Hs*54881 ESTs 2.78
312479 AI950844 Hs.128738 ESTs; Weakly similar to non-lens beta gamma-crystallin like protein [H.sapiens] 2.77
332808 CH22_FGENES.7J0 2.75
311824 AW293826 Hs^50610 ESTs 2.75
321992 C06003 Hs.116456 ESTs 2.73
316074 AW517542 Hs.208382 ESTs 2.73
309839 AW296076 EST singleton (not in UniGene) with exon hit 2.73
312071 AA683529 Hs.143119 ESTs 2.73
312684 AW294G20 Hs.117721 ESTs 2.72
332668 AA062971 Hs.181161 ESTs; Weakly similar to INHIBITOR OF APOPTOSIS PROTEIN 1 [M.musculus] 2.72
322139 H53744 EST cluster (not in UniGene) 2.72
304168 H77679 EST singleton (not in UniGene) with exon hit 2.72
325602 CH.13_hs gi|5866994 2.71
319885 R59096 Hs.136698 ESTs 2.71
300611 N75450 EST cluster (not in UniGene) with exon hit 2.71
316854 AA831215 Hs.159066 ESTs; Weakly similar to predicted using Genefinder [C.elegans] 2.69
318208 AI091458 Hs.134559 ESTs 2.68
331623 R38715 Hs.153529 Homo sapiens clone 24540 mRNA sequence 2.68
324616 AI823999 Hs.162000 ESTs 2.68
304968 AA614308 EST singleton (not in UniGene) with exon hit 2.67
314912 AI431345 Hs.161784 ESTs 2.67
300767 AW193466 Hs.136525 ESTs 2.67
313463 AI057369 Hs.122536 ESTs 2.65
320600 AA135565 Hs.250739 ESTs 2.65
301180 AI308989 Hs.156939 ESTs 2.65
324825 AA704457 Hs.255738 ESTs; Moderately similar to gag [H.sapiens] 2.65
300336 AW292417 Hs.255074 ESTs; Moderately similar to high-risk human papilloma viruses E6 oncoproteins targeted protein E6TP1 alpha [H.sapiens] 2.64
317850 N29974 EST cluster (not in UniGene) 2.64
339047 CH22_DA59H18.GENSCAN.28-7 2.64
324580 AA492588 EST cluster (not in UniGene) 2.63
321142 AI817933 Hs.209584 ESTs 2.62
319478 R06841 EST cluster (not in UniGene) 2.62
300793 AI248571 Hs.186837 ESTs 2.61
313733 AA836116 EST cluster (not in UniGene) 2.6
326505 CH.19_hs gi|5867435 2.6
314987 AW015506 Hs.130730 ESTs 2.6
303114 AF090948 EST cluster (not in UniGene) with exon hit 2.59
318709 H24244 Hs£40763 ESTs; Weakly similar to /prediction 2.58
312878 AI209108 Hs.143946 ESTs 2.57
329224 CH.X_hs gi|5868728 2.56
328018 CH.06_hs gi|5902482 2.56
323231 AA324437 Hs.177230 ESTs 2.55
312887 AW157377 Hs.132910 ESTs 2.55
315183 AW136134 Hs.220277 ESTs 2.55
300259 AI479011 Hs.170783 ESTs 2.54
313240 AI743261 Hs.131860 ESTs 2.54
316697 AW293174 Hs^52627 ESTs 2.53
313966 AI807551 Hs.189061 ESTs 2.53

331263 AA015718 ze31a12.s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:365743 mRNA sequence 2.51
310683 AW055233 Hs.160870 ESTs 2.5
302566 AA085936 Hs348572 Human PAC clone DJ404F18 from Xq23 2.5
302697 AJ001408 EST cluster (not in UniGene) with exon hit 2.5
308362 AI613519 EST singleton (not in UniGene) with exon hit 2.49
322347 AF086538 EST cluster (not in UniGene) 2.49
316240 AA974253 Hs.120319 ESTs 2.49
323208 AA203415 Hs.136200 ESTs 2.48
321643 W76005 Hs.32094 ESTs 2.48
330723 AA243617 Hs.31082 ESTs; Highly similar to do83 [R.norvegicus] 2.48
323455 AA256675 Hs.200438 ESTs; Weakly similar to atypical PKC specific binding protein [R.norvegicus] 2.47
308383 AI624497 EST singleton (not in UniGene) with exon hit 2.47
328744 CH.07_hs gi|5868290 2.47
332344 W45574 Hs.252497 ESTs 2.47
328121 CH.06_hs gi|5868031 2.47
321915 AI670955 Hs£00151 ESTs 2.46
314954 AA521381 Hs.187726 ESTs 2.45
302821 AA188868 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 2.45
329454 CH.Y_hs gi|5868887 2.45
336605 CH22_FGENES.420_4 2.45
300664 AI444628 Hs^56809 ESTs 2.44
323362 AL135067 Hs.117182 ESTs 2.44
300024 M10098 AFFX control: 18S ribosomal RNA 2.44
325026 AI671168 Hs.12285 ESTs 2.43
324510 AI148353 Hs.120849 ESTs 2.43
313389 AI765182 Hs.119903 ESTs 2.43
301309 M78276 Hs.255917 ESTs 2.43
313570 AA041455 Hs^09312 ESTs 2.43
316504 AW135854 Hs.132458 ESTs 2.42
319401 R01342 EST cluster (not in UniGene) 2.42
312827 AI744361 Hs.205591 ESTs; Weakly similar to zinc finger protein Png-1 [M.musculus] 2.42
327871 CH.06_hs gi|5868131 2.41
337173 CH22JSENES.5653 2.41
302948 AA465635 EST cluster (not in UniGene) with exon hit 2.41
324303 AU 18754 EST cluster (not in UniGene) 2.4
315527 AI791138 Hs.116768 ESTs 2.4
315979 AA830515 Hs.222917 ESTs 2.4
331310 AA253351 Hs.44439 STAT induced STAT inhibitor-4 2.4
321095 AA017595 Hs.32844 ESTs 2.4
308561 AI701559 EST singleton (not in UniGene) with exon hit 2.39
313035 N36417 Hs.144928 ESTs 2.37
322114 AA643791 Hs.191740 ESTs 2.37
313671 W49823 Hs.145553 ESTs 2.37
303211 AA099548 Hs.191436 ESTs; Highly similar to dJ1118D24.4 [H.sapiens] 2.37
301256 AA932948 EST cluster (not in UniGene) with exon hit 2.36
338165 CH22_EM:AC005500.GENSCAN.212-3 2.36
324692 AA557952 EST cluster (not in UniGene) 2.35
318587 AA779704 Hs.168830 ESTs 2.35
312378 R41582 Hs.109219 retinal degeneration B beta 2.35
318625 T48446 Hs.193162 ESTs 2.35
305181 AA663726 Hs.116922 EST 2.35
300815 AA286678 EST cluster (not in UniGene) with exon hit 2.34
324063 AW292740 Hs.254815 ESTs 2.34
315859 AA682305 Hs.133268 ESTs 2.33
305092 AA642912 EST singleton (not in UniGene) with exon hit 2.33
306598 AI000320 EST singleton (not in UniGene) with exon hit 2.33
300307 AI651016 Hs^46311 ESTs 2.33
321348 Z49979 EST cluster (not in UniGene) 2.33
325112 AI903770 Hs.124344 ESTs 2.32
336679 CH22_FGENES.43-7 2.32
321383 AJ002574 EST cluster (not in UniGene) 2.32
337357 CH22_FGENES.73<« 2.31
300680 AW468066 Hs.257712 ESTs; Weakly similar to KIAA0986 protein [H.sapiens] 2.31
327120 CH.21_hs gi|6531970 2.31
302761 AW250553 EST cluster (not in UniGene) with exon hit 2.3
312132 AI475490 Hs.170577 ESTs 2.3
315639 AA827652 EST cluster (not in UniGene) 2.3
312189 T95594 Hs.187435 ESTs 2.3

306537 AA991705 EST singleton (not in UniGene) with exon hit 2.3
327061 CH.21_hs gi|6531965 2.3
315391 AA759098 Hs.192007 ESTs 2.3
322384 AI958646 Hs.33862 ESTs 2.29
323206 AA203339 Hs.220750 ESTs 2.29
318110 AI680915 Hs.201379 ESTs 2.28
335250 CH22_FGENES.516J1 2.28
331696 Z38907 Hs.91662 KIAA0888 protein 2.28
318327 AW294013 Hs.200942 ESTs 2.28
324980 AA969121 Hs.254296 ESTs 2.28
319429 AI608881 Hs.11482 ESTs; Highly similar to junctional adhesion molecule [H.sapiens] 2.28
310601 AI970543 Hs.192605 ESTs 2.28
318505 Z43395 EST cluster (not in UniGene) 2.28
323442 AA252753 Hs.164039 ESTs 2.27
304428 AA342250 Hs.99819 ubiquitin specific protease 16 2.27
313352 AW292127 Hs.144758 ESTs 2.27
316491 AA766025 Hs.238794 EST 2.27
317751 AI697668 Hs2Q2241 ESTs 2.26
314136 AA229781 Hs.221962 ESTs 2.26
306655 AI004614 Hs.130577 EST 2.26
303946 AW474196 Hs.221604 ESTs 2.25
313435 AA769123 EST cluster (not in UniGene) 2.25
317679 AA968799 Hs.150289 ESTs 2.25
322370 AA330095 EST cluster (not in UniGene) 2.25
306620 AI000929 EST singleton (not in UniGene) with exon hit 2.24
329109 CH.X_hs gi|5868626 2.24
311043 AI871209 Hs.177128 ESTs 2.24
300228 AI458372 Hs.158748 ESTs; Weakly similar to synapsin Ib [M.musculus] 2.24
307223 AJ193698 Hs.184776 ribosomal protein L23a 2.24
309023 AI888045 EST singleton (not in UniGene) with exon hit 2.23
310749 AI493675 Hs.170332 ESTs 2.23
316769 AI914939 Hs2121B4 ESTs 2.22
320409 AA356195 EST cluster (not in UniGene) 2.21
333149 CH22_FGENES.87_8 2.21
324951 M86125 Hs.137487 ESTs 2.21
321939 AI791617 Hs.145068 ESTs 2.2
320594 AI863952 Hs.169436 arginyltransferase 1 2.2
320722 R67430 Hs.172787 ESTs 2.2
321781 D78667 EST cluster (not in UniGene) 2.2
328903 CH.08_hs gi|5868514 2.2
303889 T19204 EST cluster (not in UniGene) with exon hit 2.2
325045 T08845 EST cluster (not in UniGene) 2.2
312828 AI865455 Hs.211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 2.19
335109 CH22_FGENES.494_15 2.18
330878 AA131471 Hs.71440 ESTs 2.18
311289 AI971362 Hs.231945 ESTs 2.18
304608 AA513456 EST singleton (not in UniGene) with exon hit 2.18
337393 CH22_FGENES.747-4 2.18
332812 CH22_FGENES.7_14 2.18
327665 CH.04_hs gi|5867839 2.18
314581 AW504859 Hs.237849 ESTs 2.17
326508 CH.19_hs gi|6682496 2.17
301242 AW161535 Hs.258803 ESTs 2.17
312780 AI765651 Hs.172900 ESTs 2.17
315954 AW276810 Hs.254859 ESTs 2.16
311179 AI880843 Hs.223333 ESTs 2.16
315320 AI084182 Hs.186895 ESTs 2.16
313017 AI015203 Hs.118015 ESTs 2.16
312430 AW139117 Hs.117494 ESTs 2.15
300864 AA406539 Hs.190958 ESTs 2.15
314753 AA463262 EST cluster (not in UniGene) 2.15
322574 AF156548 EST cluster (not in UniGene) 2.15
321409 C03864 EST cluster (not in UniGene) 2.15
321205 AA002047 EST cluster (not in UniGene) 2.14
320406 AA353895 Hs.152983 HUS1 (S. pombe) checkpoint homolog 2.14
337646 CH22_EM:AC000097.GENSCAN.11-2 2.13
303084 AF174008 EST cluster (not in UniGene) with exon hit 2.13
312185 AA654772 Hs.186564 ESTs 2.13
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305813 
314465 
318168 
315990 
320712 
318487 
317462 
304384 
314544 
319881 
328078 
317354 
308617 
311568 
313605 
314289 



313659 
324596 
324783 
302696 
313418 
326920 
327574 
323207 
303753 
305235 
316055 
317194 
319565 
335146 
301475 
312442 
322502 
303693 
310179 
321121 
331330 
306557 
317865 
318667 
318042 
323818 
331286 
311262 
335601 
311351 
312996 
328190 
338030 
333940 
328227 
331481 
335288 
307513 
323316 
319479 
303482 
327489 
323935 
309575 
337043 
312897 
307881 
328656 
314569 
332783 
315259 



AI066544

AA602917 

AI821782 

AI800041 

R66867 

AI167877

AW015206 

AA235482 

AA399018 

T72744 

AW090770 

AI738720 

AW439969 

AI761786 

AA848118 



AW296067 
AW149321 
AA640770 
AA347452 
AW450674 



AI052795 

AW503733 

AA670480 

AA693880 

AW445167 

AW408683 

AI678183 

AA120970 

R62925

AA290875 

AI215643 

W23285 

AA282197 

AA994530 

AI298794 

AI493742 

AW294522 

AW245528 

AA137062 



AI682303 
AA249018 



N27448 

AI274307
AL134620 
R21945 
AA502583 

AW175841 
AW168096 

AI828174
AB70434 

AA813784 

W45302 

AA701499 



Hs.156974 
Hs.220587
Hs.190555 

Hs.143716 
Hs.178784 
Hs.62954 
Hs.250835 



Hs.192271 

Hs.218177 
Hs.204674 
Hs.221216 



Hs.124106 
Hs.105411 



Hs.114696



Hs.192201 
Hs.170315 



Hs.126036 
Hs.32922 

Hs.170917 

Hs.143199 

Hs.243665 

Hs.30120 

Hs.171381 

Hs.89002 

Hs.129130 
Hs.165210 
Hs.149991 
Hs.134754 
Hs.103653 
Hs.232150

Hs.201274 



Hs.43944 



Hs.256153 
Hs.197271 

Hs.192183 
Hs.195188 

Hs.227049 



Hs.123001 
Hs.87889 
Hs.148115 



EST singleton (not in UniGene) with exon hit 2.13

ESTs 2.12
ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 2.12

ESTs 2.11

EST cluster (not in UniGene) 2.11

ESTs 2.11

ESTs 2.11

ferritin; heavy polypeptide 1 2.11

ESTs 2.1

EST cluster (not in UniGene) 2.1

CH.06_hsgi|5868008 2.1

ESTs 2.1

EST singleton (not in UniGene) with exon hit 2.09

ESTs 2.09

ESTs 2.09

ESTs 2.08

CH22_FGENES.38_7 2.08

CH.12_hsgi|5866967 2.08

ESTs 2.08

ESTs 2.08

EST cluster (not in UniGene) 2.07

EST cluster (not in UniGene) with exon hit 2.07

ESTs 2.06

CH.21_hsgi|6456782 2.06

CH.03_hsgi|5867818 2.06

ESTs 2.06

ESTs 2.05

EST singleton (not in UniGene) with exon hit 2.05

EST cluster (not in UniGene) 2.05

ESTs 2.05

ESTs 2.05

CH22_FGENES.499_2 2.05

prostaglandin E receptor 3 (subtype EP3) 2.04

ESTs 2.04

ESTs 2.04

ESTs 2.04

ESTs 2.03

EST cluster (not in UniGene) 2.03

ESTs; Highly similar to CGI-07 protein [H.sapiens] 2.03

EST singleton (not in UniGene) with exon hit 2.03

ESTs 2.03

ESTs 2.02

ESTs 2.02

ESTs 2.02

ESTs 2.01

ESTs 2.01

CH22_FGENES.581_41 2.01

ESTs 2.01

EST cluster (not in UniGene) 2.01

CH.06_hsgi|5868077 2

CH22_EM:AC005500.GENSCAN.148-16 2

CH22_FGENES.301_6 2

CH.06_hsgi|5868105 2

EST 2

CH22_FGENES.527J 2

EST singleton (not in UniGene) with exon hit 2

EST cluster (not in UniGene) 2

ESTs 2

ESTs 2

CH.02_hsgi|6004459 1.99

ESTs 1.99

glyceraldehyde-3-phosphate dehydrogenase 1.99

CH22_FGENES.439-19 1.98

ESTs 1.98

EST singleton (not in UniGene) with exon hit 1.98

CH.07_hsgi|6004473 1.98

ESTs 1.98

helicase-moi 1.98

ESTs 1.98

313171 N67879 Hs.157695 ESTs 1.97

318060 AI241421 Hs.132236 ESTs 1.97

332256 N66393 Hs.102754 ESTs 1.97

312110 AI962180 Hs.226803 ESTs 1.97

335864 CH22_FGENES.629_9 1.97

320389 W00545 Hs.171785 ESTs 1.97

314065 AA868267 Hs.85524 ESTs 1.96

323086 H15474 Hs.12214 Homo sapiens clone 23716 mRNA sequence 1.96

323919 AA862973 Hs.220704 ESTs 1.96

310750 AI373163 Hs.170333 ESTs 1.96

309435 AW090537 EST singleton (not in UniGene) with exon hit 1.96

300129 AW028820 EST cluster (not in UniGene) with exon hit 1.96

320130 AI820675 Hs.203804 ESTs 1.95

323787 AW373446 Hs.169885 ESTs; Weakly similar to cDNA EST EMBLTQ2216 comes from this gene [C.elegans] 1.95

338112 CH22_EM:AC005500.GENSCAN.185-24 1.95

313625 AW468402 Hs.254020 ESTs 1.95

325240 CH.10_hsgi|5866848 1.95

331833 AA412102 Hs.250911 interleukin 13 receptor; alpha 1 1.95

332252 N63882 za21f9.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:293225 3', mRNA sequence 1.95

300279 AW237425 Hs.253817 ESTs 1.95

326023 CH.17_hs gi|5867245 1.95

321609 H86021 Hs.198800 ESTs; Weakly similar to hMmTRA1b [H.sapiens] 1.94

324183 AA402453 Hs.113011 ESTs 1.94

336276 CH22_FGENES.762_5 1.94

334913 CH22_FGENES.456J3 1.94

325417 CH.12_hs gi|5866925 1.94

318489 AW043590 Hs.225023 ESTs 1.94

318455 AI148763 EST cluster (not in UniGene) 1.94

306890 AI092235 EST singleton (not in UniGene) with exon hit 1.94

315073 AW452948 Hs.257631 ESTs 1.94

321289 R84687 Hs.226306 ESTs 1.94

308521 AI689808 EST singleton (not in UniGene) with exon hit 1.93

306382 AA968967 EST singleton (not in UniGene) with exon hit 1.93

331320 AA262999 Hs.42788 ESTs 1.93

324279 AA501412 Hs.191688 ESTs; Weakly similar to Pro-Pol-dUTPase polyprotein [M.musculus] 1.93

309577 AW168753 EST singleton (not in UniGene) with exon hit 1.93

327014 CH.21_hs gi|5867664 1.93

303488 AW025860 EST cluster (not in UniGene) with exon hit 1.93

306561 AA995223 Hs.129559 EST 1.92

330694 AA019806 Hs.108447 spinocerebellar ataxia 7 (olivopontocerebellar atrophy with retinal degeneration) 1.92

313083 N50545 Hs.159200 ESTs 1.92

327752 CH.05_hsgi|5867949 1.92

318674 AA295490 EST cluster (not in UniGene) 1.92

301267 AW297762 Hs.255690 ESTs 1.91

332092 AA608787 Hs.112590 ESTs 1.91

323509 AL036947 EST cluster (not in UniGene) 1.91

321452 AA317554 EST cluster (not in UniGene) 1.91

311483 AI765013 Hs.209128 ESTs 1.91

300976 AI246374 Hs.185861 ESTs 1.91

323715 AA322155 EST cluster (not in UniGene) 1.91

313800 AW296132 Hs.166674 ESTs 1.91

332029 AA489697 Hs.145053 ESTs 1.91

304013 AW518573 Hs.156110 Immunoglobulin kappa variable 1D-8 1.91

322019 AA354549 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (from clone DKFZp727C191) 1.91

334150 CH22_FGENES.339_1 1.9

310094 AW450967 Hs.235240 ESTs 1.9

316218 AW207642 Hs.174021 ESTs 1.9

324774 AI031771 Hs.132586 ESTs 1.9

326507 CH.19_hsgi|5867435 1.9

314570 AA405696 EST cluster (not in UniGene) 1.9

336268 CH22_FGENES.758_2 1.9

315278 AI985544 Hs.116429 ESTs 1.9

325824 CH.15_hs gi|5867048 1.9

316277 AA737780 Hs.213392 ESTs 1.9

323181 AA418583 Hs.143621 ESTs 1.9

301438 AA961643 Hs.127716 ESTs 1.89

307050 AI147341 Hs.146734 EST 1.89

306830 AI075803 EST singleton (not in UniGene) with exon hit 1.89

302426 AUW9925 Hsl2259B4 DKFZP547GG910 protein 1.89

320127 H72615 Hs.17268 ESTs 1.89

337736 CH22_EM:AC000097.GENSCAN.100-2 1.89

331319 AA262755 Hs.194264 ESTs 1.88

310767 AI377505 Hs.158835 ESTs 1.88

314880 AI732169 Hs.105429 ESTs 1.88

312539 AI004377 Hs.200360 ESTs 1.88

309674 AW205604 Hs.168034 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.88

314621 AI627478 Hs.187670 ESTs 1.88

319495 AI972146 Hs.192756 ESTs 1.88

313472 AA007374 EST cluster (not in UniGene) 1.88

302705 U09060 EST cluster (not in UniGene) with exon hit 1.88

329511 CH.10_p2 gi|3983514 1.88

317140 AI699412 Hs.201925 ESTs 1.87

302598 AI815985 Hs.129683 ubiquitin-conjugating enzyme E2D 1 (homologous to yeast UBC4/5) 1.87

301153 AA725670 Hs.120485 ESTs; Weakly similar to serine/threonine kinase with SH3 domain; leucine zipper domain and proline rich domain [H.sapiens] 1.87

332222 N28271 Hs.176618 ESTs 1.87

330703 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.87

318470 AI159863 Hs.143713 ESTs 1.87

314014 AW291847 Hs.121715 ESTs; Weakly similar to HP protein [H.sapiens] 1.87

300370 AI827817 EST cluster (not in UniGene) with exon hit 1.86

312329 R84768 Hs.13399 Homo sapiens clone 25032 mRNA sequence 1.86

325587 CH.12_hs gi|6682462 1.86

310237 AI884313 Hs.158906 ESTs 1.86

318872 R13085 EST cluster (not in UniGene) 1.86

303431 AA317915 EST cluster (not in UniGene) with exon hit 1.86

338427 CH22_EM:AC005500.GENSCAN.349-1 1.86

300452 AI352293 Hs.191098 ESTs 1.85

321279 H85330 Hs.146060 ESTs 1.85

301690 F05865 Hs.249180 ubiquitin-conjugating enzyme E2E 2 (homologous to yeast UBC4/5) 1.85

307932 AJ230822 EST singleton (not in UniGene) with exon hit 1.85

318292 AI679966 Hs.150603 ESTs 1.85

310254 AI239811 Hs.157491 ESTs 1.85

311790 AW016437 Hs.233462 ESTs 1.84

314248 AA278347 Hs.126078 ESTs 1.84

335586 CH22_FGENES.581_25 1.84

339209 CH22_FF113D11.GENSCAN.64 1.84

307954 AI419692 EST singleton (not in UniGene) with exon hit 1.84

302549 AF055136 Hs.248162 tectorin alpha 1.84

321629 H87213 Hs.158092 ESTs 1.84

301239 AA807558 EST cluster (not in UniGene) with exon hit 1.84

332434 N75542 Hs.75356 transcription factor 4 1.84

327192 CH.01_hsgi|5867445 1.83

310214 AJ220072 Hs.165893 ESTs 1.83

320516 R33857 Hs.181479 ESTs; Weakly similar to E-SELECTIN PRECURSOR [H.sapiens] 1.83

324231 W60827 EST cluster (not in UniGene) 1.83

336616 CH22_FGENES.613_5 1.83

328799 CH.07_hsgi|5868316 1.83

324661 AW504161 EST cluster (not in UniGene) 1.83

313190 AA766707 Hs.153039 ESTs 1.83

301979 L28168 Hs.121495 potassium voltage-gated channel; Isk-related family; member 1 1.82

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 1.82

320187 T99949 EST cluster (not in UniGene) 1.82

320791 R78808 Hs.93961 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 1.82

305733 AA829535 Hs.84298 CD74 antigen (invariant polypept of MHC; class II antigen-associated) 1.82

308280 AI569349 Hs.180920 ribosomal protein S9 1.81

321533 W78877 Hs.40111 ESTs 1.81

312946 AI915122 Hs.204087 ESTs; Weakly similar to F33D11.9b [C.elegans] 1.81

319474 H90265 Hs.100636 ESTs 1.81

329519 CH.10_p2 gi|3983510 1.81

324685 AA220982 EST cluster (not in UniGene) 1.81

320697 N62937 Hs.139181 ESTs 1.81

329246 CH.X_hs gi|5868732 1.81

332000 AA481271 Hs.193945 ESTs 1.81

310811 AI420990 Hs.161303 ESTs 1.81

325856 CH.16_hsgi|5867076 1.81

322064 Z78343 EST cluster (not in UniGene) 1.8

333712 CH22_FGENES.251J 1.8

313457 AA576052 Hs.193223 ESTs 1.8

321591 H85687 Hs.117927 ESTs 1.8

330260 CH.05_p2gi|6671884 1.8

311080 AI656320 Hs.197711 ESTs 1.8

329522 CH.10_p2gi|3983507 1.8

322889 AA081924 Hs.211417 ESTs 1.8

300175 AI275011 Hs.204877 ESTs 1.8

330976 H20560 Hs.244624 ESTs 1.8

300208 AI341180 Hs.196115 ESTs; Weakly similar to FIBRILLIN 1 PRECURSOR [H.sapiens] 1.79

319635 R17531 EST cluster (not in UniGene) 1.79

313454 AA730673 Hs.188634 ESTs 1.79

303093 AI400310 Hs.148958 ESTs 1.79

309815 AW292760 EST singleton (not in UniGene) with exon hit 1.79

326506 CH.19_hs gi|5867435 1.79

319845 AA649011 Hs.187902 ESTs 1.79

300290 AI623739 Hs.186387 ESTs 1.79

312180 AI248285 Hs.118348 ESTs 1.79

313058 D81015 Hs.125382 ESTs 1.79

330120 CH.19_p2gi|6671864 1.78

328412 CH.07_hsgi|5868405 1.78

302345 NM_000565 EST cluster (not in UniGene) with exon hit 1.78

308100 AI475949 EST singleton (not in UniGene) with exon hit 1.78

311386 AW205705 Hs.207514 ESTs 1.78

330282 CH.05_p2gi|6671910 1.78

318856 Z43011 Hs.21169 ESTs 1.78

312486 AA845630 Hs.117904 ESTs 1.78

325450 CH.12_hs gi|5866941 1.78

321206 H54178 Hs.226469 ESTs 1.78

330977 H20826 Hs.31783 ESTs 1.78

303487 AA333666 EST cluster (not in UniGene) with exon hit 1.77

310398 AI264671 Hs.164166 ESTs 1.77

313230 AI540166 Hs.129563 ESTs 1.77

317747 AI683782 Hs.128245 ESTs 1.77

303381 AL038841 Hs.163313 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.77

336123 CH22_FGENES.701_8 1.77

300185 AI286182 Hs.208484 ESTs 1.77

316002 AW451733 Hs.119824 ESTs 1.77

319850 AA001811 Hs.83722 ESTs 1.77

329941 CH.16_p2gi|6165199 1.77

328329 CH.07_hsgi|5868375 1.77

322934 AI493054 Hs.158968 ESTs 1.77

325902 CH.16_hsgi|5867101 1.76

322239 W01813 Hs.12109 WD40 protein Ciao1 1.76

303530 AI274851 Hs.258744 ESTs 1.76

300980 AI025527 Hs.222097 ESTs 1.76

331909 AA437300 Hs.178210 ESTs 1.76

321553 H92449 Hs.116406 ESTs 1.76

301618 T52760 EST cluster (not in UniGene) with exon hit 1.76

319592 AA627356 Hs.163315 ESTs 1.76

318511 T26528 Hs.227175 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.76

327183 CH.01_hsgi|5867442 1.76

313516 AA029058 Hs.135145 ESTs 1.76

318644 AI752482 EST cluster (not in UniGene) 1.76

321632 AA419617 EST cluster (not in UniGene) 1.76

324657 AW451142 Hs.255628 ESTs 1.76

300437 AW449374 Hs.257149 ESTs 1.75

319775 AA504429 Hs.6211 methyl-CpG binding domain protein 1 1.75

314775 AI149880 Hs.188809 ESTs 1.75

337460 CH22_FGENES.780-5 1.75

309849 AW297444 EST singleton (not in UniGene) with exon hit 1.75

301471 AA995014 Hs.129544 ESTs; Weakly similar to ORF YLL027w [S.cerevisiae] 1.75

312739 AI318426 Hs.155925 ESTs 1.75

319995 H15355 Hs.60887 ESTs 1.75

326495 CH.19_hs gi|5867423 1.75

337497 CH22_FGENES£0M 1.75

322633 AA004534 Hs.153981 ESTs 1.75

332177 F10812 Hs.101433 ESTs 1.75

326930 CH.21_hsgi|6456782 1.75

316893 AA837332 EST cluster (not in UniGene) 1.75

324826 AA704806 Hs.143842 ESTs 1.75

311269 AI656924 Hs.174257 ESTs 1.75

309375 AW075342 EST singleton (not in UniGene) with exon hit 1.75

314171 AI821895 Hs.193481 ESTs 1.75

311684 AI990741 Hs.252809 ESTs 1.75

334387 CH22_FGENES.380_1 1.75

312195 AI300101 Hs.252222 ESTs 1.75

315707 AI418055 Hs.161160 ESTs 1.74

324349 AW501470 EST cluster (not in UniGene) 1.74

300724 AI762929 Hs.206134 ESTs; Weakly similar to similar to reverse transcriptase [C.elegans] 1.74

309906 AW339340 EST singleton (not in UniGene) with exon hit 1.74

303714 AW501336 EST cluster (not in UniGene) with exon hit 1.74

318704 Z24981 EST cluster (not in UniGene) 1.74

303027 AF111178 EST cluster (not in UniGene) with exon hit 1.74

322601 W92924 EST cluster (not in UniGene) 1.74

319382 H93199 Hs.33665 ESTs 1.74

315858 AA737345 EST cluster (not in UniGene) 1.74

332243 N55484 Hs.220540 ESTs; Highly similar to ARYL HYDROCARBON RECEPTOR NUCLEAR TRANSLOCATOR [H.sapiens] 1.74

330951 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.74

324044 AL045752 Hs.211519 ESTs 1.73

320630 AA199847 EST cluster (not in UniGene) 1.73

327288 CH.01_hsgi|5867481 1.73

314986 AI201367 Hs.142860 ESTs 1.73

319078 H17255 Hs.144515 ESTs 1.73

326278 CH.17_hsgi|5867269 1.73

302552 H49792 EST cluster (not in UniGene) with exon hit 1.73

322322 AF086431 EST cluster (not in UniGene) 1.73

327075 CH.21_hsgi|6531965 1.73

317392 AI797588 Hs.145459 ESTs 1.73

300810 AI076890 Hs.186949 ESTs 1.73

315978 AA830893 Hs.119769 ESTs 1.73

323903 AA773580 Hs.193598 ESTs 1.73

330803 AA004699 Hs.150580 putative translation initiation factor 1.73

309845 AW296802 Hs.255580 EST 1.73

314963 AI689617 Hs.200934 ESTs 1.73

311710 F09774 Hs.175971 ESTs 1.73

315315 AI984592 Hs.15088 ESTs 1.73

300378 AA663560 Hs.235873 ESTs; Weakly similar to K11C4.2 [C.elegans] 1.73

316141 AW303457 EST cluster (not in UniGene) 1.72

319826 T71739 Hs.75442 albumin 1.72

312961 AI033922 Hs.122517 ESTs 1.72

334379 CH22_FGENES.379J1 1.72

305854 AA862733 EST singleton (not in UniGene) with exon hit 1.72

313031 N34927 Hs.186566 ESTs 1.72

329728 CH.14_p2gi|6065785 1.72

312090 N57692 Hs.118064 ESTs 1.72

323341 AL134875 Hs.192386 ESTs 1.72

302077 AA310580 Hs.132898 Homo sapiens chromosome 11; BAC CIT-HSP-311e8 (BC269730) containing the hFEN1 gene 1.71

310766 AI971438 Hs.158824 ESTs 1.71

311450 AI809985 HsJ>03340 ESTs 1.71

311792 AW238064 Hs.253909 ESTs 1.71

321500 H71999 EST cluster (not in UniGene) 1.71

311948 T78791 Hs.241569 ESTs; Moderately smlr to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.71

302270 R56151 EST cluster (not in UniGene) with exon hit 1.71

329089 CH.X_hsgi|5868614 1.71

322331 AF086467 EST cluster (not in UniGene) 1.71

318235 AI080361 Hs.134217 ESTs 1.71

304561 AA489792 EST singleton (not in UniGene) with exon hit 1.71

312681 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.71

310250 AI476629 Hs.158465 ESTs 1.71

338178 CH22_EM:AC005500.GENSCAN.219-6 1.71

338910 CH22JW32I10.GENSCAN.11-2 1.71

321225 AL080073 Hs.251414 Homo sapiens mRNA; cDNA DKFZp564B1462 (from clone DKFZp564B1462) 1.7

322289 AA534550 Hs.539 ribosomal protein S29 1.7

319802 AI701489 Hs.202501 ESTs 1.7

314022 AW452420 Hs.248678 ESTs 1.7

314937 AA515602 Hs.152330 ESTs 1.7

300580 AA761322 Hs.220538 ESTs 1.7

304398 AA262785 EST singleton (not in UniGene) with exon hit 1.7

313421 AW339515 Hs.163700 ESTs 1.7

309763 AW270182 EST singleton (not in UniGene) with exon hit 1.7

322092 AF085833 EST cluster (not in UniGene) 1.7

315503 AA764768 Hs.121158 ESTs 1.7

325031 T08597 EST cluster (not in UniGene) 1.7

327157 CHJOIJa gi|5B66841 1.7

314809 AI741461 Hs.161904 ESTs 1.7

320361 H57220 Hs.146406 nitrilase 1 1.69

324721 AW402302 Hs.43616 ESTs 1.69

328624 CH.OJJiS gi|5868246 1.69

303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 1.69

328960 CH.08_hsgi|6456775 1.69

315702 AA657501 Hs.146315 ESTs 1.69

302385 AJ224172 Hs.204096 lipophilin B (uteroglobin family member); prostatein-like 1.68

319699 R14537 EST cluster (not in UniGene) 1.68

309506 AW137700 EST singleton (not in UniGene) with exon hit 1.68

330417 D84424 Hs.57697 hyaluronan synthase 1 1.68

315296 AA876905 Hs.125286 ESTs 1.68

328538 CH.07_hsgi|5868485 1.68

323923 AA354146 EST cluster (not in UniGene) 1.68

320303 AL079289 Hs.137154 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35971 1.68

302967 AI927068 Hs.110853 ESTs; Weakly similar to R10D12.12 [C.elegans] 1.68

310695 AI472124 Hs.157757 ESTs 1.68

307512 AI273815 Hs.242463 keratin 8 1.68

338506 CH22_EM:AC005500.GENSCAN.390-10 1.68

331722 AA195405 Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 1.68

301431 R05385 EST cluster (not in UniGene) with exon hit 1.68

318853 Z42977 Hs.21062 ESTs 1.68

323032 AW244073 Hs.145946 ESTs 1.68

317538 AW137772 Hs.185980 ESTs 1.68

325780 CH.14_hsgi|6381953 1.67

321739 AL080280 EST cluster (not in UniGene) 1.67

319808 T58960 EST cluster (not in UniGene) 1.67

313443 AA249037 EST cluster (not in UniGene) 1.67

331366 AA424754 Hs.43149 ESTs 1.67

316443 AI797592 Hsi07407 ESTs 1.67

322878 AA081820 EST cluster (not in UniGene) 1.67

330320 CH.08_p2gi|5932415 1.67

329081 CH.X_hsgi|5868602 1.67

334026 CH22_FGENES.318_3 1.67

317791 AI801500 Hs.128457 ESTs 1.67

322235 AF086106 EST cluster (not in UniGene) 1.66

331148 R73816 Hs.17385 ESTs 1.66

325452 CH.12_hsgi|5866941 1.66

315106 AW452184 Hs.232100 ESTs 1.66

326014 CH.16_hsgi|5867160 1.66

307130 AJ185234 EST singleton (not in UniGene) with exon hit 1.66

300943 AA524545 Hs.224630 ESTs 1.66

319402 W21298 EST cluster (not in UniGene) 1.66

310889 AI457946 Hs.170437 ESTs; Weakly similar to hyperpolarization-activated; cyclic nucleotide-gated channel 2 [H.sapiens] 1.66

323371 AL135118 EST cluster (not in UniGene) 1.66

335568 CH22_FGENES.581_4 1.66

320654 AW263086 Hs.118112 ESTs 1.66

338983 CH22_DA59H18.GENSCAN.3-1 1.65

330002 CH.16_p2 gi|6623963 1.65

315343 AW205477 Hs.179891 ESTs 1.65

334487 CH22_FGENES.395_9 1.65

312169 AI064824 Hs.193385 ESTs 1.65

309668 AW204480 Hs.253414 EST 1.65

309518 AW148928 Hs.248895 EST 1.65

307965 AI421641 EST singleton (not in UniGene) with exon hit 1.65

316787 AW369770 Hs.130351 ESTs 1.65

300835 AA401858 Hs.224843 ESTs 1.65

338763 CH22_EM:AC005500.GENSCAN.517-16 1.65

303327 AA232729 Hs.154302 ESTs 1.65

313231 AW139993 Hs.163682 ESTs 1.65
334073
319901 
326530 
301126 
314043 
304387 
322932 
337272 
332694 
318996 
315336 
313329 
318088 
313835 
320035 
309372 
324157 
323929 
302490 
333942 



T77136 

AI802877
AA827082 
AA236027 
M099732 

AA262768 

Z44266 

AW342028 

AW293704 

AW295409 

AI538438 

AA378974 

AW074330 

AW402236

AA354940 

AA885502 



3)1918 
315664 
304405 
310624 
319250 
310608 
317348 
306513 
320807 
303710 



304236 
317683 
311960 
312834 
325326 
313663 
327526 
300429 
305169 
316621 



316035 
300492 
316532 
332048 
307113 
319127 
331155 
338220 
315763 
323571 
312240 
304569 
313179 
326858 
317276 
312572 
311932 
302103 
308413 
310077 
337780 
327796 
308352 
324539 
303232 
337884 



AA476777 

AI744068 

AA282572 

AI341594 

F11623 

AI962234 

AI348076 

AA989230 

AA086110 

AI269069 

W93278 
AI791700
AW440133 
AI028309 

AI953261 

AW449679 
AA663131 
AI021996 

AI744130 

AL031709 

AI307229 

AA496019 

AI183686

N49476 

R87650 

AW515270 

AA984133 

R26628 

AA490934 

AI076101 

AI823847 

AA350125 

AW451654 

AA452310 

AI636253

AI620617



CH22_FGENES.327_28 
Hs.8765 RNA helicase-related protein

CH.19_hsgi|5867441
Hs.210843 ESTs; Weakly similar to dJ1039K5.2 [H.sapiens]

EST cluster (not in UniGene)

EST singleton (not in UniGene) with exon hit

EST cluster (not in UniGene)

CH22_FGENES.660-1
Hs.243901 KIAA1067 protein 

EST cluster (not in UniGene) 
Hs.256112 ESTs 
Hs.122658 ESTs 
Hs.137945 ESTs 
Hs.159087 ESTs 



1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.64 
1.64 
1.64 
1.64 
1.64 
1.64 
1.64 



AJ610791 
AI378032
AA437414 



Hs.130720 ESTs; Weakly similar to CELLULAR NUCLEIC ACID BINDING PROTEIN [H.sapiens] 1.64

EST singleton (not in UniGene) with exon hit 1.63

EST cluster (not in UniGene) 1.63

Hs.145958 ESTs 1.63

Hs.187032 ESTs 1.63

CH22_FGENES.301_8 1.63

CH.02_hsgi|5867772 1.63

EST cluster (not in UniGene) with exon hit 1.63

Hs.160712 ESTs 1.63

EST singleton (not in UniGene) with exon hit 1.63

Hs.157522 ESTs; Moderately similar to env protein [H.sapiens] 1.63

EST cluster (not in UniGene) 1.63

Hs.196102 ESTs 1.63
Hs.831 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase (hydroxymethylglutaricaciduria) 1.63

EST singleton (not in UniGene) with exon hit 1.63

Hs.188536 Homo sapiens clone 24838 mRNA sequence 1.63

Hs.250852 ESTs; Highly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 1.63

CH.07_hsgi|5868363 1.63

EST singleton (not in UniGene) with exon hit 1.63

Hs.127893 ESTs 1.63

Hs.189690 ESTs 1.62

Hs.114246 ESTs 1.62

CH.11_hsgi|5866875 1.62

Hs.169813 ESTs 1.62

CH.02_hsgi|6381882 1.62

Hs.156739 ESTs; Highly similar to XG GLYCOPROTEIN PRECURSOR [H.sapiens] 1.62

EST singleton (not in UniGene) with exon hit 1.62

Hs.122138 ESTs 1.62

CH.14_p2gi|6272129 1.62

Hs.131201 ESTs 1.62

multiple UniGene matches 1.62

Hs.184304 ESTs 1.62

Hs.201591 ESTs 1.62

EST singleton (not in UniGene) with exon hit 1.62

EST cluster (not in UniGene) 1.62
Hs.33439 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.61

CH22_EM:AC005500.GENSCAN.246-9 1.61

Hs.118342 ESTs 1.61

Hs.153260 c-Cbl-interacting protein 1.61

Hs.203669 ESTs 1.61

EST singleton (not in UniGene) with exon hit 1.61

Hs.131704 ESTs 1.61

CH.20_hsgi|6552462 1.61

Hs.129986 ESTs 1.61

Hs.187499 ESTs 1.61

Hs.257482 ESTs 1.61

Hs.26090 ESTs; Weakly similar to T20B12.1 [C.elegans] 1.61

Hs.196511 EST 1.61

Hs.148565 ESTs 1.61

CH22_EM:AC000097.GENSCAN.121-2 1.61

CH.05_hsgi|5867982 1.61

EST singleton (not in UniGene) with exon hit 1.61

Hs.125892 ESTs 1.61

EST cluster (not in UniGene) with exon hit 1.61

CH22_EM:AC005500.GENSCAN.54-2 1.61

303620 AA397546 Hs.119151 ESTs 1.61

303481 AA336839 EST cluster (not in UniGene) with exon hit 1.61

314481 AA548589 Hs.105846 ESTs 1.61

300327 AI908894 Hs.245893 ESTs 1.6

323473 AA262442 EST cluster (not in UniGene) 1.6

326154 CH.17_hsgi|5867170 1.6

331920 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.6

323827 AW406878 EST cluster (not in UniGene) 1.6

322452 W56710 EST cluster (not in UniGene) 1.6

310597 AI739071 Hs.158515 ESTs 1.6

307871 AI368665 EST singleton (not in UniGene) with exon hit 1.6

322215 AF088005 EST cluster (not in UniGene) 1.6

318420 AI139857 Hs.143837 ESTs 1.6

332217 H98987 Hs.102383 EST 1.6

324937 M79230 Hs.192398 ESTs 1.6

320543 AF052176 Hs.158529 Homo sapiens clone 24457 mRNA sequence 1.6

300674 AW467388 EST cluster (not in UniGene) with exon hit 1.6

315193 AI241331 Hs.131765 ESTs 1.6

319713 R24204 EST cluster (not in UniGene) 1.6

301210 AI379982 Hs.158944 ESTs 1.6

309365 AW072861 EST singleton (not in UniGene) with exon hit 1.6

321403 AW451454 Hs.247568 adenylate kinase 3 1.6

321908 AA376936 Hs.20998 ESTs 1.6

303349 AA382661 EST cluster (not in UniGene) with exon hit 1.6

324338 AL138357 Hs.247514 ESTs 1.6

310599 AW300144 EST cluster (not in UniGene) 1.6

333193 CH22_FGENES.98_15 1.6

336433 CH22_FGENES.825J2 1.6

312097 AI352096 Hs.157169 ESTs 1.6

311445 AW204237 Hs.192703 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.59

317736 AI361722 Hs.192410 ESTs 1.59

308147 AI498991 EST singleton (not in UniGene) with exon hit 1.59

313489 AA017492 Hs.135655 ESTs 1.59

316289 AA902488 Hs.122952 ESTs 1.59

326983 CH.21_hs gi|5867657 1.59

314781 AW205298 Hs.202372 ESTs 1.59

328397 CH.07_hsgi|5868397 1.59

331970 AA461084 Hs.187677 ESTs 1.59

321744 N91419 Hs.12028 ESTs 1.59

310509 AI292181 Hs.150036 ESTs 1.59

315921 AI147545 Hs.114172 ESTs 1.59

322049 AI928242 Hs.144383 ESTs 1.59

301161 AA731518 EST cluster (not in UniGene) with exon hit 1.59

300548 AI026836 Hs.114689 ESTs 1.59

319142 F07366 EST cluster (not in UniGene) 1.59

313526 AW152263 Hs.249243 ESTs 1.59

305937 AA883238 EST singleton (not in UniGene) with exon hit 1.58

330123 CH.19_p2gi|6671869 1.58

327819 CH.05_hsgi|5867968 1.58

318250 AI478814 Hs.134603 ESTs 1.58

306760 AI034094 Hs.169476 tubulin; alpha; ubiquitous 1.58

322358 AA220235 Hs.246836 ESTs 1.58

317866 AJ690269 Hs.201345 ESTs 1.58

320725 AA703319 Hs.120967 ESTs 1.58

311332 AW292247 Hs.255052 ESTs 1.58

334893 CH22_FGENES.452_7 1.58

318730 AA398215 EST cluster (not in UniGene) 1.58

315889 AW271639 Hs.221744 ESTs 1.58

303702 AW500748 Hs524961 ESTs; Weakly similar to 73 kDA subunit of cleavage and polyadenylation specificity factor [H.sapiens] 1.57

315086 AJ492660 Hs.170935 ESTs 1.57

332514 AA156499 Hs5454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.57

335549 CH22_FGENES.576J0 1.57

329532 CH.10_p2gi|3983505 1.57

323140 AA180467 EST cluster (not in UniGene) 1.57

313166 AI801093 Hs.151500 ESTs 1.57

337896 CH22_EM:AC005500.GENSCAN.56-3 1.57

330658 AA319514 Hs.211093 ESTs 1.57

324585 AI823969 Hs.132678 ESTs 1.57






317151 AW298195 Hs.255735 ESTs 1.57

308818 AI819700 Hs.208231 EST 1.57

326547 CH.19_hsgi|5867307 1.57

318833 H06234 Hs.24888 ESTs 1.57

320488 R31386 EST cluster (not in UniGene) 1.57

305929 AI124514 EST singleton (not in UniGene) with exon hit 1.57

338083 CH22_EM:AC005500.GENSCAN.174-1 1.57

316868 AI660898 Hs.195602 ESTs 1.57

310937 AI472880 Hs.170480 ESTs 1.57

328638 CH.07_hsgi|6004473 1.57

310074 AI651039 Hs.148559 ESTs 1.56

327058 CH.21_hsgi|6531965 1.56

320076 AI653733 Hs.204079 ESTs 1.56

322345 AF086529 EST cluster (not in UniGene) 1.56

314731 AI745498 Hs.204579 ESTs 1.56

318687 H49619 Hs.127301 ESTs 1.56

303841 AI934464 EST cluster (not in UniGene) with exon hit 1.56

302370 AJ009849 Hs.199297 Homo sapiens GNAS1 gene encoding NESP55 1.56

322571 AF156271 EST cluster (not in UniGene) 1.56

318050 AI052093 Hs.133132 ESTs 1.56

303388 AL039604 EST cluster (not in UniGene) with exon hit 1.56

323758 AA833858 EST cluster (not in UniGene) 1.56

328369 CH.07_hsgi|5868388 1.56

329415 CH.Y_hsgi|5868874 1.56

303915 AW468839 Hs.257767 EST 1.56

338794 CH22_EM:AC005500.GENSCAN.528-1 1.56

303074 AA243481 Hs.127320 ESTs; Weakly similar to KIAA0346 [H.sapiens] 1.56

318807 F08434 EST cluster (not in UniGene) 1.56

334287 CH22_FGENES.369_17 1.56

311928 AW024798 Hs.233374 ESTs 1.55

304592 AA505833 Hs.162017 EST 1.55

300785 AA682913 Hs.247179 ESTs; Weakly similar to KIAA0319 [H.sapiens] 1.55

304921 AA603092 EST singleton (not in UniGene) with exon hit 1.55

324605 AW502851 Hs.249978 ESTs 1.55

324473 AW501163 EST cluster (not in UniGene) 1.55

300566 H86709 Hs.21371 son of sevenless (Drosophila) homolog 1 1.55

314165 AA761265 Hs.221281 ESTs 1.55

302868 AA157392 EST cluster (not in UniGene) with exon hit 1.55

314034 AI299137 Hs.154214 ESTs 1.55

325389 CH.12_hsgi|5866921 1.55

331849 AA417078 Hs.193767 ESTs 1.55

320536 AA331732 Hs.137224 ESTs 1.55

303347 AA258033 EST cluster (not in UniGene) with exon hit 1.55

315769 AA744875 Hs.189413 ESTs 1.55

317031 AA973297 Hs.126101 ESTs 1.55

300203 AI827065 Hs.224877 ESTs 1.55

304037 T26438 EST singleton (not in UniGene) with exon hit 1.55

322613 AW160507 EST cluster (not in UniGene) 1.54

317987 AW138174 Hs.130651 ESTs 1.54

322313 AF086386 EST cluster (not in UniGene) 1.54

323992 AW411383 Hs.169688 ESTs 1.54

325303 CH.11_hsgi|5866908 1.54

312701 AI457663 Hs.128127 ESTs 1.54

304787 AA582678 EST singleton (not in UniGene) with exon hit 1.54

305849 AA861571 EST singleton (not in UniGene) with exon hit 1.54

314557 AA401367 Hs.128647 ESTs 1.54

316507 AI381515 Hs.158381 ESTs 1.54

315023 AA533505 Hs.185844 ESTs 1.54

314920 AA513406 Hs.152307 ESTs 1.54

323097 Z44354 Hs.180950 guanine nucleotide binding protein (G protein); q polypeptide 1.54

325043 W27919 Hs.32944 inositol polyphosphate-4-phosphatase; type 1; 107kD 1.54

307892 AI376086 Hs.158759 EST 1.54

324573 AA491600 Hs.161942 ESTs 1.54

313092 AI923673 Hs.212827 ESTs 1.54

324696 AA641092 Hs.257339 ESTs 1.54

303019 AF098363 EST cluster (not in UniGene) with exon hit 1.54

317158 AI459140 Hs.129109 ESTs 1.54

309536 AW151933 EST singleton (not in UniGene) with exon hit 1.54

301568 AI146423 Hs.146709 ESTs 1.53






315574 AA651923 Hs.191850 ESTs 1.53
321861 N79341 EST cluster (not in UniGene) 1.53
310890 AI184510 Hs.143728 ESTs 1.53
330036 CH.17_p2gi|6042048 1.53
316907 AA843868 Hs.190567 ESTs 1.53
312299 AA972712 Hs.174818 ESTs 1.53
331128 R51361 Hs£3423 ESTs 1.53
305177 AA663591 EST singleton (not in UniGene) with exon hit 1.53
337585 CH22_EM:AC000097.GENSCAN.77-1 1.53
335290 CH22_FGENES.527_3 1.53
308896 AI858667 EST singleton (not in UniGene) with exon hit 1.53
307944 AI418246 EST singleton (not in UniGene) with exon hit 1.53
300867 AW340374 Hs.121033 neural precursor cell expressed; developmentally down-regulated 1 1.53
335320 CH22_FGENES.534_7 1.53
329841 CH.14_p2gi|6672062 1.53
317916 AI565071 Hs.159983 ESTs 1.53
332901 CH22_FGENES.36_2 1.53
305413 AA724659 EST singleton (not in UniGene) with exon hit 1.53
316707 AI016387 Hs.184406 ESTs 1.53
313693 AW469180 Hs.170651 ESTs 1.53
316101 AA922236 Hs.221037 ESTs 1.53
320796 AF038966 Hs.184543 secretory carrier membrane protein 1 1.53
307451 AI248615 EST singleton (not in UniGene) with exon hit 1.53
323648 AI679968 Hs.152060 ESTs 1.53
331482 N27515 Hs.40296 ESTs 1.53
318059 AI023175 Hs.167022 ESTs 1.53
325958 CH.16_hsgi|5867142 1.53
315736 AA664265 Hs.230213 ESTs 1.53
314740 AW015667 Hs.119427 ESTs 1.52
314117 AA224368 Hs.185164 ESTs 1.52
301646 AA313954 EST cluster (not in UniGene) with exon hit 1.52
338752 CH22_EM:AC005500.GENSCAN.513-10 1.52
309314 AW009312 EST singleton (not in UniGene) with exon hit 1.52
301445 AI208364 Hs.128233 ESTs; Weakly similar to REGULATOR OF CHROMOSOME CONDENSATION [H.sapiens] 1.52
308501 AI685263 Hs.201150 EST 1.52
312330 AA635305 Hs.121574 ESTs 1.52
318040 AI018150 Hs.148781 ESTs 1.52
336205 CH22_FGENES.719_10 1.52
325701 CH.14_hsgi|5867028 1.52
315009 AW189460 Hs.208358 ESTs 1.52
303121 AW407585 Hs.27769 ESTs; Weakly similar to mCAC [M.musculus] 1.52
309271 AI986221 EST singleton (not in UniGene) with exon hit 1.52
328385 CH.07_hsgi|5868395 1.52
307700 AI316545 EST singleton (not in UniGene) with exon hit 1.52
314591 AW103292 Hs.245328 ESTs 1.52
304484 AA432067 Hs.258373 ESTs 1.52
304382 AA232873 EST singleton (not in UniGene) with exon hit 1.52
304232 W52674 EST singleton (not in UniGene) with exon hit 1.52
309853 AW298169 Hs.57553 tousled-like kinase 2 1.52
312504 AW207346 Hs.143202 ESTs 1.52
313134 N63406 Hs.258697 ESTs 1.52
330391 AF015950 Hs.115256 telomerase reverse transcriptase 1.52
314342 AI873046 Hs.258775 ESTs 1.51
305977 AA887293 EST singleton (not in UniGene) with exon hit 1.51
301165 N85789 Hs.224155 ESTs; Weakly similar to PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE [H.sapiens] 1.51
300613 AI932294 Hs.249604 ESTs; Weakly similar to B-CELL LYMPHOMA 6 PROTEIN [H.sapiens] 1.51
324124 AI554212 Hs.185664 ESTs; Weakly similar to SERINE/THREONINE-PROTEIN KINASE NRK2 [H.sapiens] 1.51
308037 AI458207 Hs.174181 ESTs 1.51
323909 AL043148 Hs.186257 ESTs 1.51
315464 AW139500 Hs.116135 ESTs 1.51
306700 AI022056 EST singleton (not in UniGene) with exon hit 1.51
337976 CH22_EM:AC005500.GENSCAN.107-1 1.51
306855 AI083982 EST singleton (not in UniGene) with exon hit 1.51
311045 AI569399 Hs.174746 ESTs 1.51
315010 AA531082 Hs£40049 ESTs 1.51
310205 AW025248 Hs.202445 ESTs 1.51
310759 AW135924 Hs^24883 ESTs 1.51
310354 AW449044 Hs.171298 ESTs 1.51
312019 T77046 Hs.188750 ESTs 1.51
334773 CH22_FGENES.430_5 1.51
332043 AA490831 Hs.125058 ESTs 1.51
322950 AA296219 EST cluster (not in UniGene) 1.51
337920 CH22_EM:AC005500.GENSCAN.67-3 1.51
328993 CH.09_hsgi|5868536 1.51
309245 AI972447 EST singleton (not in UniGene) with exon hit 1.51
312172 AI222168 Hs.191168 ESTs 1.51
304039 T47349 EST singleton (not in UniGene) with exon hit 1.5
301329 AI149653 Hs.190496 ESTs 1.5
313376 AI949246 Hs*00381 ESTs 1.5
324248 AW504918 EST cluster (not in UniGene) 1.5
308771 AI809301 EST singleton (not in UniGene) with exon hit 1.5
334935 CH22_FGENES.464_3 1.5
319764 AA019827 EST cluster (not in UniGene) 1.5
318519 T27135 EST cluster (not in UniGene) 1.5
332807 CH22_FGENES.7_9 1.5
322310 AF086376 EST cluster (not in UniGene) 1.5
324557 AA489166 Hs.156933 ESTs 1.5
332118 AA609585 Hs.162689 EST 1.5
319539 R09027 EST cluster (not in UniGene) 1.5
313149 AW291092 Hs^01G58 ESTs 1.5
329722 CH.14_p2gi|6065785 1.5
323514 AA861209 EST cluster (not in UniGene) 1.5
308078 AI472621 EST singleton (not in UniGene) with exon hit 1.5
337965 CH22_EM:AC005500.GENSCAN.100-10 1.5
335905 CH22_FGENES.635J3 1.5






TABLE 14A shows the accession numbers for those primekeys lacking UniGene ID's for
Table 14. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
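
The three-column layout described above can be sketched in code. The following is only an illustrative reading of one row, assuming whitespace-delimited fields with the probeset key first, the gene cluster number second, and the remaining fields as accession numbers; the names `ClusterRow` and `parse_cluster_row` are hypothetical and are not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterRow:
    pkey: int              # Eos probeset identifier
    cat_number: str        # gene cluster number, kept as text
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one whitespace-delimited row into its three columns (assumed layout)."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


# Example row taken from the table body.
row = parse_cluster_row("323473 193878.1 AA262442 AA768862 AA262443")
print(row.pkey, row.cat_number, len(row.accessions))
```

The cluster number is kept as a string because its suffix is part of the identifier rather than a decimal fraction.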



Pkey CAT number Accession 



322064 234514J 
321409 197898J 

322092 4S578J 
321452 212379_2 
313603 199797J 
320856 36098.1 



322139 
321500 
313733 
322215 
322235 
321632 
313833 
322310 
322313 
322322 
322331 
322345
322347 
322370 
321739 
321781 
314570 
300129 
322452 
321861 
323140 
322520 
321914 
322571 
322574 
314753 
300370 



46806J 
552826 1 
441212J 
4700^1 
47070J 
286374J 
120893J 
47376J 
47386J 
47434.1 
47467 1 
47537.1 
47545.1 
187612J 
43998.1 
1511778 1 
280469.1 
635249 1 
497108.2 
1651920.1 
159551.1 
38916.1 
85114.1 
22297.1 
39412.1 
311451J 
.3910.2 



322601 577912.1 
322613 34330.1 



316055 409389.1 
323316 981458.1 
300492 25768.1 



BE261 397 Z78343 BE176419 AA383657 N90640 AA334052 AW955761 BE536232 AA374087 AA584776 

N71838 AA282003 T54072 AA761419 H92966 A1831371 A1095435 AI690247 R99331 AW9641 10 AA975590 AA346128 

H94196C03864 

AF085833 R69689 AW341677 AA923375 BE327566 AW630415 R69601 AW615339 
AW962489 H64300 AA329527 
AA284333 AW4681 19 AA284334 AA810992 

AB040928 T94673 AI289313 AI536Q39 Z44366 BE141499 D601 16 D61488 D59945 AM19503 R28090 R72986 H03255 
AI1891 12 AI912312 AW51 1018 AI401349 AW470144 C14624 AI335797 Z4O300 AI014456 D60269 D601 15 T16722 AI370673 
D60270 

H53744AF075088 H53797 
BE004271 A1248023 A10221 57 H71999 
AA766346 AA809877 AA6361 16 AW469598 AW977404 
AF08B005 N51816N51731 

AF0861 06 Al 193589 AW665594 N71795 AA722627 AW665373 A1300251 

AW812795 AA419617 H87827 AW299775 AW382168 AW382133 BE171659 AW392392 BE171641 AA541393 

AA766825 AA81 1 180 AA085906 AI762946 AW977820 

AF086376 W77804 W72689 AA837735 

AF0863B6 W77947 W72708 

AF086431 AA886756 AI557237 

AF086467W81444 W81445 

W95298 AF086529 AI912190 AW294159 AI458747 W94782 

AF086538 W95969 AI631911 W95835

AA330095 W251 12 AA249401 

AL080280 T73124 H02689 AL080281 

D78667 D78871 C18258 

AA904776 AA405696 AA405962 

AW028820AI219068 

AI147202 W56755 W56710 

N79341 N99082 N47551 

AA180467 AA449184 AA464831 AA505048 

T55958 T57205AF147346 

AA011603N58604 N58611 

NMJH6102 AF156271 AA781868 AW152318 AW770403 AA909463 AA482996 AA758672 
AF156548 AA639797 AI675267 AI825497 AI823355 
AA463262 AA463615 AW160405 AW407583 

AW136181 AA581939 AK001221 AA694538 AA424043 AI016272 AA098960 AA884473 AI356180 BE391633 AA437086 
AI277866 AA098827 AA992680 BE172624 AA424101 AA320776 AW962967 N77431 AW858960 AW858897 T85649 
AA357743 AI827817 AI905672 

AI082335 W92924 BE048524 AW005302 AI084474 A1369330 AI827710 AW135506 AW298694 

AW1 60507 NM 013367 AF191338 AA384939 AI445790 AA730309 BE3970O3 BE267753 AI979163 N50386 AW583671 

AW583608 BE074466 BE074479 BE074471 AW976283 AA604393 AW162122 W73648 AI823475 N75898 W73713 

AW470099 AW513236 AWO25055 AW6131 15 A1923379 W58081 AW664525 AW196795 AI143619 A1565152 AA025406 

AA505846 AI665494 AA829964 N59156 N59163 R15442 AA826919 A161Q221 AT200120 AA603279 AW150822 AJ189513 

AI807122 AI016368 AI335868 AW583389 A) 193892 AI956157 AIB28879 AW591589 AW583446 A1955406 AW148396 

AI340255 AI867942 AA748525 AA876991 Z38516 AI874002 AI859474 N63100 AA429094 AA082443 

AW105663 AA693880 AW517398 A1768507 BE220851 AW978538 AA831489 

BE219300 BE327455 AL134620 R36741 R17996 

AL031709 AI249061 AA907658 AI420444
308362 792518J

307783 697809J 

301161 427238.1 

324094 270098 J 

309023 4737J 



316141 423880.2 AW303457 AA972713 AA724265 

323371 117336.2 N45114N51465 BE087338 AI083551 AL135118 BE395609 

307700 30923 J 1 BE2B0998 BE254670 BE294951 BE564979 AW405364 AA069256 AA129837 AI559667 BE281405 AW410850 BE041 153 

A1254811 AW301340 AI613335 AW301411 AJ609469 AI611607 AI611616 AI377623 A1335509 AI613544BE043165 A1371663 
AI340452 AI612066 AW072890 AI254558 AI349884 AI370095 AI613383 A161 1946 AI613353 AB07414 AI318229 AI612685 
AW305327 AW268924 AI370063 AI349292 BE049068 A1369098 AW274098 AI344845 AW075187 AI053401 AI345220 
BE138515 A1613386 AI583302 AW301955 AI349661 AI307432 AI054168 AI223913 A!612081 AI348942 AI334539 AI309366 
AI370098 A(252360 AW086316 AW26891 1 AW073482 AI379802 A1224284 AI053661 Al 334538 AI309369 A1309588 AI310023 
AI492709 AI335418 AI053999 AI366989 AW073478 AI247058 AI249584 AI305875 AI308585 AW071272 AI271487 AI340719 
AI365995 AI223673 AW271066 AI611938 AW071296 A1270796 AI254385 A1251393 AI252562 AW268236 AI254858 
AW071317 AI309102 AI609897 AW268971 AI583267 AI792484 AW0751 68 BE138443 AI254126 AI309822 AI310872 
A!61 1953 AI251054 AW276658 AI335405 AW075039 AI31 1768 AI612028 AW271895 AI612005 AI312240 AW271082 
AI371642 A1334879 AI310194 AI310772 AI345419 A1334675 AI223914 AI284707 AI284813 AI349140 AI254853 At313094 
AJ310170 AI309499 AI312476 AI376484 AI335467 AI340802 AI309815 AI310168 AJ61 1446 A1345824 BE327775 A1318545 
F17185AW614950 
AW998989AI613519 
AI347274AW844024 
AA731518AA765714 

BE395109 AW663898 AW237041 AJ492154 BE046906 AI651285 AI983290 AW002590 AI201Q40 F32424 AA992272 
AW271836 

AF180681 NM.015313 AA229509 AA225792 AA216413 AJ868045 BE005205 AB002380 T55518 BE276097 AW380669 
BE142836 AW370976 AA479384 R95425 A(680999 AA595138 H54582 At022709 T§5440 AI041769 AA861 144 AW392028 
AA479287 AA824634 A1638446 H54691 R95382 AA770352 A1640467 AW293491 AA779138 R28298 AA970562 C15590 
R84455 AA020769 AL036394 H8G566 BE5488B1 AA301207 AW959414 A1284253 AA043173 W52429 BE544571 R24852 
Z42603 F13120 R24340 R24326 T75305 H701 10 N56255 AA334210 F11453 AW947285 H80345 AA298992 AW380931 
AI267175 245421 AW380981 W86113 AA663590 AA167577 BE566760 BE169166 AA449904 AA459205 N31126 W03564 
N31208 AW993277 N44765 AW605275 D61449 W68572 AA258190 D60496 AW992964 U46277 H04097 AA370360 
AW957211 AA159775 AI631243 H83367 H21671 D61077 AW392712 N21112 H98522 N45298 N83629 AI393509 AW022043 
AA744886 AI580482 AA723286 AI422244 AI423984 D62804 AI088349 AA587890 AI144172 N33275 BE074397 H03399 
D62578 AI056639 AI829918 AA579584 AI089460 A1350124 W68573 AI580828 H98897 AB70468 H83715 W861 14 AA923123 
D57446 AA043174 AW337721 AI266551 AI140017 AW022356 D79855 D79650 D79393 D60495 AA788666 AA693443 
AW516977 W60139 AI628156 AW473223 AI608892 AA159670 AW440366 A1421529 T50751 AI174374 AA912234 AA724248 
AW780400 AA907218 H80514 D57452 AA863419 AA552618 D29614 R44556 T16452 R44935 Z41 132 D29188 H69692 
AI250176 AI078860 AA370359 AW183108 H74200 AA258183 F10723 C00323 R86148 AA860570 AW130073 AL079946 
AA410327 AA532614 AA234500 AI151507 AA410288 AW969839 AA483232 AJ383200 AA236540 AI807672 H73441 
323473 193878.1 AA262442 AA768862 AA262443 

315639 392767.1 AA827650 AA827652 AW629526 BE044585 AW974451 AA761439 AA648505 AA765803 
322878 117013.1 AA081820AA082191 AA079811 
301239 457668.1 AA807558 AA827117AW629567 

301256 16720 1 NM.016603AF251038 AI124624 AA776579 AW298470AI304868AW082724 A1348442BE218336N20641 AI018013 

AW858832 AW978157 AA815187 AA932948 AF157316 AI444958 W00848 W02935 AI434933 N26335 AA428681 AW371059 
AI651612 AW134937 AW96891 1 AA488815 AL157523 W48766 AW936954 AW936941 AW579205 AW936886 AW936889 
N74541 AW936953 AW578421 AW604352 AW357088 AW849258 AW849453 AW371606 AI554921 W49785 H99814 
AA805957 AA904606 AW206696 BE169229 AA333951 AA 190704 AW936944 AA463219 AA430306 AW805704 N48503 
BE222307A1638612 BE550045 AI805304 A!690987 AA776841 H12690 AW183731 AI380760 AI636261 AA812641 
AW592656 AI686132 AA843424 K99220 AW084996 AW128879 AI800871 AA610135 AA191524 AI150076 AI474530 
AA748461 N29013 AA746372 N59606 

N75450 AA877636 AW137945 W05248 AA514763 AW972399 AI758397 AW195051 
AW402931 BE393099 
AL036947T93676 T85475 

AA641735 AA281881 AA861209 AA934756 AA835887 AA641795 AA748822 AW295703 
AW467388 AA826954 

AF168711 AA099732 BE019157 AI380212 BE298159 AA249097 AA305112 AW962349 AW962353 AW401801 BE292961 
AM39469 AA442919 A1630537 AA724473 AI814288 AW966815 AI376871 AI860202 AI683132 AA099733 AW627633 
AI754022 BE206347 AW183349 AI378222 BE178926 A1473282 W52944 AW752469 AW966817 
AA301 270 AA30 1379 AA301366 

H85652 AA1 14024 AA296219 AA375304 AW963796 AW885952 AW020969 AA 114025 AI604930 BE350971 A1765355 
AW317067 AW974763 H85930 AW172600 AI310231 AW612019 D62908 D62864 AA652738 AI674617 AI494064 AW138666 
AI147620 A1147629 AW61 1793 AI668922 AI971005 AI864742 AA174171 

AK001701 AA134337 AA356202 BE163251 AW875175 AW875181 AW875177 BE163389 AK000741 AA247755 AA120819 
AW868040 AA3091 18 AW962348 AA471267 AW996843 AK001452 BE005344 BE617899 AA1 86588 AA1 20820 AW36331 1 
AA648105 N71529 BE168417 AW673900 AI8581 60 AA134338 AA659697 N22162 AI335437 AI31 1237 AI343171 AI336661 
AW268074 AW274348 AA935005 AW576295 AW262626 AW593153 AA730O55 AA662650 AA782687 AW894855 AI933533 
AW1 93002 AW899448 AW890142 AW812670 AA085664 AA334191 BE178085 BE180553 AA389680 AA984772 AA442527 
W26560 8E384359 AA847210 AW304931 AI669606 AA085613 AW197240 AI632828 AA581646 AW129348 A1017643 
AW089030 D20893 AI382955 AI557148 AW499979 
324231 975669.1 W60827 AL079968 AL047234 
324248 977901.1 AW504918 N55410AL1 18584 AW839266 

323691 221757.1 AA317561 AI793000AW235111 AI793176 AA767397 AI263113AA719462 



300611 337193.1 

324157 247225.2 

323509 967739.1 

323514 197787.1 

300674 466093.1 

322932 39838.1 



323591 209807.1 
322950 10774.1 



322957 29014.1 






315B58 406384 1 AA737345 AA682288 AT799378 

301431 569736 1 R05385 AI061251 

324303 233842 1 AL1 18754 AA333202 H38001 

324330 300543 1 AA884766 AW974271 AA592975AA447312 

300815 41537.2 BE1 52396 BE152395AA287515BE001 834 AA286678 AW406477 

324349 1154015.1 AW501470AW502931 AW499500 

323715 225129 1 AA322155 AA326396 AA326538 

309314 23273 -3 AW009312 

323758 229624 1 AA833858 AW978090AA327679AA8 10436 

309375 127 J AF286598 AW075342 AB028994 AL043713 AW378914 AA340650 N57166 AW956914 R17961 AA336481 BE393734 

AW977867 AW294638 AA927857 AA961627 AW303969 AW894416 AA8121 19 AA912758 AA424355 AA490582 W30941 
AA476693 AA131029 AA127777 AL043714 AA496984 T51 1 17 AA127722 AA594012 AI492876 N76483 AW1 19061 BE464926 
AW303419 AI972370AI76B172 AB26550 AI435432 AI379516 M778421 AI276089 AA424521 N59361 AA723153AA723176 
AI867487 AA090677 AI827221 AI351027 W02732 AI810729 AA142848 AI0821 10 N59379 N29744 AI283747 AI148665 
AW779845 AI382967 F34319 AI369934 AI282438 AW183449 AA863467 AA813469 AI092645 AI870701 AA8631 19 
T65475 R07576 T17017 F08143 Z43546 
T08845 Z43538 F06691 

BE560824 BE513941 AW238907 AA580852 AW501 176 BE241846 AW501 163 AW751433 AW501340 BE241715 AI910774 
AW406878 AW966560 AW966151 AW966496 AA336174 AA335376 AA335537 
R56151 W91936 
T52761 T52760 

AJ277841 AI630669 AI804370 Z41939 AW751251 AA299456 Z44739 AW860471 Z30158AW1 05391 H56997 W84688 
AA491201 W84636 M706815 AI131055 AA483636 AI005075 AW340034 AJ332372 AW1 18195 AI338932 AI191968 
AA693932 AI189982 AI193225 AA884163 AA594562 W37747 AA249754 AA746131 AI916540 AI832188 AW946555 
AA833838 Z40564 AA861563 F01447 AA887937 AI933559 AW973250 AA566018 AA313954 
AA354146 A! 1 84230 AA643525 
AA492588 AA492498 AA492571 
AA814859 AA814857 A1582623 
AW902251 AW168753 

X12830 NM.000565 AW503691 X58298 S72848 AA193347 AW503481 AW177946 AW178192 AW178188 AA285233 
AA410577 AA193465 AW177939 AW365459 BE221693 
AW207734 060164 D81 150 D81078 D61355 AW996804 
AW503101 AA309184 N56323R70998 
AW504161 AW503601 AW505509 

AF226667 AA207032 AA100804 AA121287 AA488316 AI808218 AW419048 AI91 1097 AW132123 AA50231 1 AW089948 
AA100952 AI075431 AW083432 AI990554 BE466029 F2B643 AF086422 W79581 AW439007 F37179 W79780 AW439035 
AA731381 AW750380 AA251012 AW589846 AA730238 AA329792 AW087255 AA220982 AA082469 AA877260 AA232380 
BE298910 

AA557952 AA677593 AA618150 
AW979189 AA837332 AA856946 AA876935 

AF1 1 1 178 NM.005708 AF105267 AW590040 AI979280 AA001322 BE146329 AA702430 AA702429 AA694221 AI206348 
AI206285 AW770197 AA923032 AI379586 AA701 165 AW594643 AA001909 AW002368 

AI739168 AA426249 AI199636 AW5Q5198 AW977291 AA824583 AA883419 AA724079 AI015524 AI377728 AW293682 
AS928140 AA731438 A10924O4 A1085630 AA731340 
AA631739 AA768584 AW134477 
AA640770 A16831 12 AA913009 
AFO90948 AI064898 All 1 1 182 

AB018257 BE148640 AA081832 AK001915 AF150217 AF161350 AJ219174 AW074664 D60040 AA346065 H28750 
AW151783 BE613360 BE612628 BE502031 AW183790 AA992580 AA505815 AI310432 AI678015 AW592679 AA879181 
AA806708 AI744110H24681 C16064 D62900AI285033 AA346064 AI865123 AW467798 BE221231 AL120676 N89877 
AI928370 AI358387 AA748486 AV64747B AV647460 AA312313 AI279340 AW505099 
AA005122 H49792 
AA476777 T86049 

AA437414 AA131479 AA086182 AB037775 AW161063 AW5 14393 AA332331 AW136197 BE150789 AA425533 AA249605 
N88308 AI016201 BE004662 AA291027 R57587 AA424277 AA476391 W07532T97036 AA218898 AW162629 R57770 
W01278 W90204 W90156 AL1 19197 R84513 AA280103 AA334994 AW965504 AA460868 AA447470 AW1 38594 W38898 
W90028 AI078353 W90078 AA699696 N35523 AA704225 AA035059 AW134892 AA1 15140 AI142854 H90084 AA826342 
AA460694 N46339 AA425344 N56953 AA035569 AI761083 AI658696 AI524818 AI338965 AW069249 AW299871 BE464061 
AI189720 AW340682 AI423380 AI275122 H17532 N80735 AA826343 AI039694 BE328398 AI192947 AW271286 AI623122 
A1922902 AW293087 N22141 AA730657 AW316610N26473 F06663 Z4361OH14783 R59761 H1 1540 AI265915 AI681773 
A1091748 BE220636 AW841861 AI702181 AI468447 AA907544 AI273941 AW244034 R37769 AA446663 T96929 BE045884 
AA476341 H89994 H29043 AW051211 N49522 AA306977 

302696 33570 1 AK000738 AA347452 AW961713 H70832 AI750643 AA362887 AW955588 W44974 AA279599 AW298762 AA452666 

AA443355 A1337273 AA446931 AI752977 AA661554 W42674 AI292172 R41 163 AA62 1381 AI244157 

302697 43219J AJ001409 AJ001410 

309917 57485.2 AW340014 AW866993 AV651649 

303347 192210.1 AA258033 AA459485 
303349 193138.1 AA382661 AW958642 AA259088 

310599 690880.1 AW300144 AI338491 AI798381 BE220076
325031 266373.2
325045 1534945.1 
324473 38795 1 
323827 235506 1 
302270 1734192J 
301618 10967.5 
301646 42154.1 



323923 249295.1 

324580 328264.1 

316774 463723 1 

309577 6483.6 

302345 29533.1 

302358 1064753.1 

324614 215437 J 

324661 385257.1 

324685 41003.1 



324692 351987.1 

316893 473541.1 

303027 21796 J 

324715 290035.2 

324771 385085.1 

324783 389515.1 

303114 37417.1 

303124 21112.1 



302552 82290.1 
301918 316229.1 
303232 20474.1



303388 969232 1 ALG39604 AL039497 

302761 45074 J AW250553 L07876 235843 R30693 A1190097 AW965317 

318455 606341 1 A! 148763 AI903763 AI903753 AJ903762 AI903800 At903801 

317850 363835_1 AI581545 AI951714 AI570397 AW873588 AA836396 AI359986 AI499790 AA773477 AI951615 T07547 AW304709 AF114041
BE176629 Z44580 T30422 T32690 AW953065 H10602

303431 32082 1 NMJXJ0539 AA019013 AA019367 AA056154 K38735 AA057003 AA021051 H38102 AA015774 AA059291 AA019439 H84843 

H83375 AA019914 AA017288 RB4449 W26519 H3825B AA018736 H84147 AA018577 AA059353 U49742 H38767 AA318341 
AA317553 H86646 H91989 AA317398 AA317378 W29024 W23034 T27877 AW950059 AA017195 R84262 AA057177 
HB9941 AA019904 H84662 AA015775 AA019368 AA020976 H37900 C20733 H38682 H85197 AA018578 AA017252 
AA019440 AA059059 H38651 H84148 AA018560 W25754 C20752 AA317915 AW952115 AA317369 M01 9845 R85402

AA019492 AA017196 AA056093 AA056094 AA058836 AA056155 W25957 W23027 AA056159 W23043 W21890 W28951 
AA317978 W26459 AA317265 
319127 1653640J N49476Z45911 R21061 
303480 232749J AA331906 AA332484 
303481 31534_1 AK001952 AA336839 AW249271 BE247287 AF182002 BE613472 AW962673 AA332235 AW849937 AW849814 H49893

AA477148 AW968944 AF 182003 AW007897 BE246145 W76100 AI480141 AW410205 AA609339 AI20911 1 AW000979 
AA330280 AW961554 W72865 H49894 AA514317 AA620407 AA504522 AW472B33 AA7 16609 AW129282 AA347351 
AA62B378 AW589860 AI636696 AA464632 AA464533 AW874189 AA757076 AA479654 AW517910 AW292357 AW872638 
AW262288 AI910666 AW513749 AW238771 AA215797 BE3870f73 

BE143533 AW850432 AK000042 AA333666 AA385314 AW966616 AW793068 AW793414 AA361 103 AW390841 AA040095 
AW385058 AW799162 AI383115 AI990745 AI653703 BE503693 AW150758 AI949919 AW190450 AW512348 A1625970 
AW501057 N52954 AE281378 A14O1710 AI648409 AW002659 AI687639 AI093943 R33960 AA040062 AI926267 AI240425 
A152091 1 AI093428 R52943 

303488 36085 1 AI040372 AB040915 W4C569BE158910BE158914 D63226 AW025860AW583088 AA334307AA210942 AW753212 

AW805322 AA362635 BE158911 AW891225 AW994862 AA805451 R28541 AA229347 N48266 AI377788 R28682 R36122 
AA81 1941 AI240742 AI632001 T99965 W01976 AW8912Q5 AW891177 T97433 C15571 AA346850 AA504293 W07500 
A1694503 AA489216 AA327725 AW959917 AA694146 N68514 AI076285 AW016248 T07783 AA642400 AA716133 AAB05332 
R0O312 AA705021 AW498605 AW891723 AW891906 AA808025 N29039 N74897 W60393 AA810184 A1627460 AW057516 
AA807436 AA760966 AB59295 N78642 N20662 AA830300 W81705 AA832258 AW891718 AI811796 AW515523 Z41735 
AA449978 AW891714 A1684539 AW891896 AW071701 AI890916 A1924994 AI039743 AA888524 AA244214 AI015736 
AI270105AI865077 

F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502 
H08370Z46168 F07366 AA1 93168 AA1 93138 

AK000290 A1476034 AA465309 BE148761 AW303607 AW958665 AW469635 AIB19365 AI243857 AW469326 AA1571 10 
AA278626 AA496257 AA306656 F29732 AA831B59 AA312210 AA564476 AA579065 AA769522 AA740386 AI205635 
AA491 643 AA81 0400 AA417708 AI567332 AA157392 N53817 AA374229 
R68545 T271 19 R25687 AW750672 
H13364 T27135 R61679 AA746905 
H77679 

AB038995 NM.016530 AK001 1 1 1 AA465635 AW968716 U66624 AA885459 AA703019 AI040266 AI018689 AI692886 
A1125372 AI376796 AJ192040 N58161 AL133607 AW503673 AW505479 AA362265 AJ404671 
F11623H17552AA347728 

BE311816 AK000916 AW868037 AW868039 AF228527 Af752482 AW86B041 AA077049 AI201537 W55873 AA206019 
AA077918 AW968729 A1978828 AW139620 AI093053 AW204025 AI418805 AA598926 AA586345 AA045669 BE314455 
AA045668 

W01 166 AW996900 BE184300 Z44887 T34535 R51495 AW886575 AA295490 AA295162 AA295163 AW937125 T56951 
BE386106 W52674 

AW500106 BE241915 AW503971 NM.016542 AB040057 AA313812 AK000S56 W 16504 AI822088 AA259107 AA191319 
BE085957 AA309584 BE122687 AW952435 T84469 BE088194 BE088132 AA328562 BE092674 AA263102 T39634 
AW992380 R79391 R24392 H03060 AW675066 AI299952 AW020325 D25953 N75199 AA361425 AW612302 AW236333 
AW673897 AW953686 N22323 AA649166 A1377099 M03061 AI660072 AW276405 AA809779 AI803430 AW297484 
AW510384 AA814816 AA371522 D63035 AA953567 R79392 R24282 AA876831 AW297542 AI699023 AA992652 AI041436 
318704 799152J AI631602 AW589676Z28684Z24981 
318730 275116 1 Z32887BE349923AA398215AA399231 
303714 1155758 1 AW501336 AW501337 
304387 183612 1 AA236027 BE003275 

304398 10169 T AA195509 BE394661 AV660757 AA489161 BE165972 AW503705 AA262785 AF123320 Z78357 NM.014171 AF161488 

AA248971 BE568575 AA461410 AA165108 A1637731 H75454 AA372934 AW339334 BE568754 BE564697 BE567299 
AI681606 BE537269 AW1 97204 AA290890 AI189393 AW292463 AW470227 F27399 AW61 1942 BE566888 AW301701 
AI675761 AI628429AA164711 AI797753 AI656879A191 2690 AI675277 AI695099 A1094Q95 AW01 4158 BE091 059 AI201 748 
AW236961 AI038003 AI083606 AA401606 AI079405 AI073516 AI655537 AA401475 AI814532 AI079862 AI093789 AI422084 
A1216476 AI392760 AA926998 AA781782 Z25198 AI086377 A) 18551 1 A) 185539 Z28843 AI223792 AJ379563 AA706253 
AI433798 AI9218B5 H75455 AW025269 AI224100 AI08361 1 A1225057 AW1 96334 AI572254 AA761628 AI472801 AA283784 
303751 468554 1 AA830149 AW978407 M85983 AW503637 

319401 1323199 1 W00973 N56457 AW992226T84921 R01342 

319402 1003489J R86913R86901 H25352 R01 370 H43764 AW044451 W2 1298 
318807 1536467.1 F08434 Z42573 H28810 
319478 765461 1 AI524124 R06841 R06842 
318872 1534581.1 Z43108 F06295 R 13085 






303494 238389J 
319142 164820J 
302868 12593J 



318518 1205335.1 

318519 434741J 
304168 72494_-10 
302948 21445J 

319250 244351 1 

318644 17700J 



318674 204968.1 
304232 20640.2 
388 1



318885 94880.2 AA742999 243272 AA345258 AW956677 AA031942 

303841 79133 1 W19657 BE616760 BE259848 BE382680 BE615587 AI934464 AA322745 T07155 AW961 174 AA307302 241888 M621992 

AA188400 AW770608 AI147458 A114840B AI698291 AA972591 
303889 1777183J T19204T36109T38107 

319539 63198J R09027 AA344892 AA329574 AW955648 AW978708 AI567804 AI378935 AW014657 AI804134 R08922 N92947 BE5467B8 

318905 1536408 1 F08365 Z43395 R54298 

320187 396254,1 T99949 AA654769AA664550AW975264 

318996 65715J 244266 H06384 AV655948 

319635 163534 J R17531 AW960899 AA338366 AW673294 BE047729 BE047722 AA330746 AW841797 H05030 Al 142105 R12654 
319699 747196 1 AI458682 H24240R1 4537 R 18426 AW867082 
319713 1699356J R24204R15712T84695 

319761 75324 2 AW630974BE005208R84237AA724997AA334867AW955777R18816 

319764 86596J AA019827 R18947 K46852 

319808 7069 3 T58960 AA609160 AA621130AI927236 AA431075 

321040 193331.1 AA261830 AW967855 H26953 AA262478 

320409 43709J AA226B69 AA296516 AW959753 AA186390 AL359619 AA356195 AA148427 R22748 AI033624 BE548853 H95327 

AW579751 BE561649 AA397533 BE617136 AA236444 T89946 AA247450 N55777 W38725 AI743846 AI808406 AA922229 
AI051464 W04713 R1 1251 W19656 AI042319 AA489276 AI224533 H95274 AW269958 T8931 1 A1890088 AI862754 
AI830968 AI669336 AI589780 AA534557 AW273839 AI338155 AI126632 N83542 BE046048 AA807028 AA848107 
AW1 67978 AA976930 AA148428 AI289304 AI524262 A1625961 AA773469 AI222288 A1280054 AI242371 AA227222 
AA973329 AA296517 AA829436 AA234526 AJ 149769 AI567865 AA936939 AI590681 AW469308 AJ689531 AA486419 
A1422051 AI057252 AA626941 A1475352 AW247913 AB22370 AA670122 AW198Q34 AA486418 AI363794 AA380739 



319881 1585983.1 


H51299 H44619 H46391 R86024 H51892 T72744 


320488 368456 1 


AI817336 R32883 AA595590 A1743065 R31386 


321121 1545647.1 


W23285 H42714 F25381 F37215 


321205 81249 1* 


AA002047 N72537 H541 42 H 81 580 


321253 375160 1 


AA6 10649 AI699484 H59558 


*314fM3 1^125 1 


AA827082 AA732246 AA1 67611 AA830741 


^20630 176R5 2 


AA1 99847 AA41Q224 R53323 AW936567 AW936569 AW936568 AW936571 




AATfiQI?^ AAR3171R AWQ77666 W92553 




AAOfKIP'i WQ501Q WQ3335 AA249037 


00011 1 


AA0n7^74 AA0O7466 AI8 16886 


321348 41762 1 


Z49979 D61703 U30168 


314138 179960J 


AA740616 AA654854 AA229923 


320712 57156.2 


R66867 R65678 R82673 W73128 R83101 


321383 41924.1 


AW968556 AJ238555 AW968731 AJ002574 AA459446 H70260 AW977557 AA767351 




AI300460 AA907450 AA649224 T07415 A1536896 BE018515 AI279865 BE047421 


312996 187327.1 


AW368634 AI702169 AI245179 AW368646 BE545574 AA249018 AW368633 N27553 


306513 AA989230
306537 AA991705
306557 AA994530
306598 AI000320
306620 AI000929
306700 AI022056
308078 AI472621
306813 AI066544
306830 AI075803
306855 AI083982
329722 c14_p2
329728 c14_p2
306890 AI092235
308100 AI475949
308147 AI498991
306929 AI124514
308352 AI610791
308383 AI624497
308521 AI689808
308561 AI701559
308617 AI738720
308771 AI809301
308828 AI824829
308896 AI858667
303019 41850.1 AF098363 AF098365
303084 44211_1 AF174008 AF174027 AF174106
305092 AA642912
305169 AA663131
305177 AA663591
305235 AA670480
305413 AA724659
305849 AA861571
305854 AA862733
307113 AI183686
307130 AI185234
305937 AA883238
305977 AA887293
307451 AI248615
307513 AI274307
307848 AI364188
307871 AI368665
307881 AI370434
307932 AJ230822
307944 AI418246
307954 AI419692
307965 AI421641
309245 AI972447


309271 






AW072861 




AW074330 






OU30UO 


AW1 37700 




AW151933 

nil 1 9 1 w»>J 






r io he 




ooc/cn r19 he 




QOCyfCO r 10 he 








OvaOOif 


AW29B076 


o/VJftdQ 


AW9Q74A4 

Mi %£X9I *f f rt 






JUil/UO 0 1 / DO_ 1 




304037 T26438
304039 T47349
304236 W93278
304257 AA053294
304382 AA232873
304405 AA282572
304561 AA489792
304569 AA490934
304787 AA582678
304921 AA603092
327819 c_5_hs
304968 AA614308
306382 AA966967
331263 47479_1 AW780192 AA015718 W02571
332252 1663967_1 N63882 T91174



TABLE 14B shows the genomic positioning for those primekeys lacking UniGene ID's and
accession numbers in Table 14. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
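
As an illustration of how the strand and nucleotide-position columns relate, the sketch below normalizes one entry to an ascending coordinate pair and a length. It rests on two assumptions that the table itself does not state: that coordinates are inclusive, and that Plus-strand entries are written low-to-high while Minus-strand entries are written high-to-low, as in the listed rows. The function name `exon_span` is hypothetical.

```python
from typing import Tuple


def exon_span(strand: str, nt_position: str) -> Tuple[int, int, int]:
    """Return (low, high, length) for a 'start-end' position string.

    Assumes inclusive coordinates, so length = high - low + 1.
    """
    first, second = (int(part) for part in nt_position.split("-"))
    if strand == "Plus":
        if first > second:
            raise ValueError("Plus-strand entry should be written low-to-high")
    elif strand == "Minus":
        if first < second:
            raise ValueError("Minus-strand entry should be written high-to-low")
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    low, high = min(first, second), max(first, second)
    return low, high, high - low + 1


# Two entries taken from the table body.
print(exon_span("Plus", "297686-297808"))
print(exon_span("Minus", "1390386-1390296"))
```

Normalizing to an ascending pair makes entries from both strands directly comparable; the original orientation is still recoverable from the strand value.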



Pkey Ref

332807 Dunham, I. et al.
332808 Dunham, I. et al.
332812 Dunham, I. et al.
332901 Dunham, I. et al.
333149 Dunham, I. et al.
333916 Dunham, I. et al.
334026 Dunham, I. et al.
334061 Dunham, I. et al.
334073 Dunham, I. et al.
334150 Dunham, I. et al.
334379 Dunham, I. et al.
334719 Dunham, I. et al.
334773 Dunham, I. et al.
334893 Dunham, I. et al.
334935 Dunham, I. et al.
335146 Dunham, I. et al.
335320 Dunham, I. et al.
335568 Dunham, I. et al.
335586 Dunham, I. et al.
335601 Dunham, I. et al.
336036 Dunham, I. et al.
336123 Dunham, I. et al.
336268 Dunham, I. et al.
337173 Dunham, I. et al.
337460 Dunham, I. et al.
337685 Dunham, I. et al.
337736 Dunham, I. et al.
337780 Dunham, I. et al.
337965 Dunham, I. et al.
337976 Dunham, I. et al.
338030 Dunham, I. et al.
338112 Dunham, I. et al.
338165 Dunham, I. et al.
338178 Dunham, I. et al.
338427 Dunham, I. et al.
338506 Dunham, I. et al.
338794 Dunham, I. et al.
338910 Dunham, I. et al.
339047 Dunham, I. et al.
332864 Dunham, I. et al.
332933 Dunham, I. et al.
333193 Dunham, I. et al.
333712 Dunham, I. et al.
333940 Dunham, I. et al.
333942 Dunham, I. et al.
334287 Dunham, I. et al.
334387 Dunham, I. et al.
334487 Dunham, I. et al.
334913 Dunham, I. et al.
335109 Dunham, I. et al.
335250 Dunham, I. et al.



Strand 


NLposition 


Plus 


297686-297808 


Plus 


298277-298360 


Plus 


309688-310561 


Pius 


1841954*1842090 


Plus 


3574317-3574413 


Plus 


8298994-8299169 


Plus 


9196549-9196681 


Plus 


9686941-9687077 


Plus 


9792201-9792374 


Plus 


10529221-10529854 


Plus 


13908356-13908467 


Plus 


15778859-15779026 


Plus 


16235169-16235328 


Plus 


19302753-19302881 


Plus 


20108247-20108373 


Plus 


21491292-21491457 


Plus 


22542132-22542246 


Plus 


24935021-24935655 


Pius 


24990333-24990497 


Plus 


25044923-25045157 


Plus 


29019796-29019877 


Plus 


30051089-30051186 


Plus 


31997555-31998040 


Plus 


23624127-23624224 


Plus 


32536159-32536395 


Plus 


3547161-3547245 


Plus 


3850500-3850643 


Plus 


4113793-4113990 


Plus 


7034267-7034392 


Plus 


7166011-7166119 


Plus 


8072708-8072827 


Plus 


10391398-10391600 


Plus 


12205719-12205875 


Plus 


12800037-12800181 


Plus 


19685043-19685354 


Plus 


21221871-21221953 


Plus 


27114697-27114763 


Plus 


28795375-28795551 


Plus 


30760793-30760968 


Minus 


1390386-1390296 


Minus 


2035790-2035681 


Minus 


3832993-3832494 


Minus 


7286177-7286073 


Minus 


8523830-8523671 


Minus 


8552629-8552330 


Minus 


13294116-13293871 


Minus 


1394602M3945781 


Minus 


14432191-14432132 


Minus 


19463909-19463815 


Minus 


21325792-21325667 


Minus 


21952922-21952826 
335288  Dunham, I. et al.  Minus   22304275-22303770
335290  Dunham, I. et al.  Minus   22309950-22309891
335549  Dunham, I. et al.  Minus   24666203-24666128
335882  Dunham, I. et al.  Minus   26690300-26690125
335864  Dunham, I. et al.  Minus   26694537-26594382
335905  Dunham, I. et al.  Minus   26988888-26988719
336205  Dunham, I. et al.  Minus   30477456-30477311
336276  Dunham, I. et al.  Minus   32093320-32093181
336433  Dunham, I. et al.  Minus   34067540-34067425
336605  Dunham, I. et al.  Minus   15616509-15616358
336616  Dunham, I. et al.  Minus   26021027-26020848
336679  Dunham, I. et al.  Minus   2035790-2035681
337043  Dunham, I. et al.  Minus   17407330-17407251
337272  Dunham, I. et al.  Minus   28241476-28241307
337357  Dunham, I. et al.  Minus   30906179-30906109
337393  Dunham, I. et al.  Minus   31471747-31471569
337497  Dunham, I. et al.  Minus   33371317-33371258
337646  Dunham, I. et al.  Minus   2648689-2648632
337920  Dunham, I. et al.  Minus   6051648-6051510
338083  Dunham, I. et al.  Minus   9318438-9318301
338220  Dunham, I. et al.  Minus   14166440-14166104
338752  Dunham, I. et al.  Minus   26421374-26421135
338763  Dunham, I. et al.  Minus   26628148-26628009
338983  Dunham, I. et al.  Minus   29908865-29908702
339209  Dunham, I. et al.  Minus   32492953-32492593

325240  5866848
329532  3983505
329522  3983507
329519  3983510
329511  3983514
325326  5866875
325303  5866908
325389  5866921
325417  5866925
325450  5866941
325452  5866941
325498  5856967
325587  6682462
325602  5866994
325701  5867028
325780  6381953
329722  6065785
329720  6065785
329666  6272129
329815  6624888
329841  6672062
325824  5867048
325866  5867076
325902  5867101
325958  5867142
326014  5867160
329941  6165199
330002  6623963
326154  5867170
326023  5867245
326278  5867269
330036  6042048
326547  5867307
326495  5867423
326507  5867435
326505  5887435
326506  5867435
326530  5867441
326508  6682495
330120  6671864
330123  6671869
326858  6552462
326983  5867657
327014  5867664




32301-32650 


Plus 


42937-43014 


Minus 


35265-35458 


Plus 


18407-18597 


Pius 


20965-21325 


Pius 


47726-48024 


Minus 


73556-73630 


Plus 


239672-239759 


Minus 


110635-110745 


Minus 


435379-435552 


Minus 

IVtll IUO 


704103-704202 


Plus 


173372-173930 


Plus 


126724-126967 


Plus 


79122-79251 


Minus 


72936-73046 


Plus 


63634-63873 


Minus 


1 12713-1 1?Q92 




207544-207741 

cyji v*t*t bvi i *t i 


Phis 


98307-98446 


Minus 


68431-68720 


Mint ic 


40181-40331 


Minus 


42450-42fl33 


Minus 


8*WO S7**OfcO 


Minus 


127729-127842 


Plus 


S3437-*v£50 


Minus 


MAKA. 10447 


Minus 




pi iic 


46097-46158 


Mimic 


7103-7170 


Plus 


171799-171896 


Plus 


75250-75903 


Plus 


117120-117216 


Minus 


623677-623870 


Plus 


11843-11930 


Minus 


13038-13111 


Minus 


8818-8949 


Minus 


9368-9509 


Minus 


303000-303122 


P0JS 


78904-79112 


Minus 


127553-127656 


Minus 


35311-35406 


Minus 


69337-69670 


Minus 


16023-16581 


Plus 


1017630-1017788 
326930  6456782  Plus    606950-607705
326920  6456782  Minus   42425-42519
327058  6531965  Plus    2384268-2384835
327061  6531965  Minus   3486389-3486673
327075  6531965  Plus    4041318-4041431
327120  6531970  Minus   6-1088
330126  6093735  Plus    82458-82623
327157  5866841  Minus   4408-4746
327183  5867442  Plus    84317-84531
327192  5867445  Minus   194652-194764
327288  5867481  Plus    48583-48773
327469  5867772  Plus    145549-145708
327489  6004459  Minus   57796-58015
327526  6381882  Minus   97010-97123
327574  5867818  Plus    68767-69126
327665  5867839  Plus    141736-141900
327752  5867949  Plus    93721-94421
327819  5867968  Minus   92202-92717
327796  5867982  Plus    85267-85405
330260  6671884  Plus    45203-45269
330282  6671910  Plus    3982-4114
328076  5868008  Plus    72807-72865
328121  5868031  Plus    153782-153850
328190  5868077  Plus    21082-21165
328227  5868105  Minus   21082-21242
327871  5868131  Minus   88889-89221
328018  5902482  Minus   542547-543133
328624  5868246  Minus   120666-120836
328744  5868290  Plus    138639-138722
328799  5868316  Minus   80771-80923
328291  5868363  Minus   144244-144434
328329  5868375  Plus    191709-192239
328369  5868388  Plus    75371-75583
328385  5868395  Plus    369952-370155
328397  5868397  Plus    344967-345063
328412  5868405  Plus    86427-86519
328538  5868485  Plus    3814-4243
328656  6004473  Plus    792616-792729
328638  6004473  Plus    294618-294903
328903  5868514  Plus    23625-24468
328960  6456775  Plus    38547-38837
330320  5932415  Minus   54458-54697
328993  5868536  Plus    49160-50084
329081  5868602  Plus    93368-93510
329089  5868614  Plus    25805-26923
329109  5868626  Plus    102168-102273
329192  5868716  Plus    166936-167020
329216  5868726  Minus   71408-71707
329224  5868728  Plus    27422-27664
329246  5868732  Minus   250541-250792
329415  5868874  Plus    1011438-1011818
329454  5868887  Plus    51342-51593
TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16

Table 15 depicts UnigeneID, UnigeneTitle, Primekey, Predicted Cellular Localization, and
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is
linked by EosCode to Table 16.
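
The EosCode linkage described above can be pictured as a simple keyed lookup. The following Python sketch is illustrative only: the dictionary layout, the field names, and the function `link_by_eos_code` are assumptions introduced here, and the single example entry (EosCode BCU4, accession NM_024915, SEQ ID NO:1) is read from the OCR text of the tables and sequence listing.

```python
# Hypothetical in-memory form of one Table 15 row (annotation side).
table_15_rows = [
    {"eos_code": "BCU4", "unigene_title": "hypothetical protein FLJ13782"},
]

# Hypothetical in-memory form of the sequence listing, keyed by EosCode.
table_16_sequences = {
    "BCU4": {"seq_id_no": 1, "accession": "NM_024915"},
}


def link_by_eos_code(rows, sequences):
    """Attach sequence-listing fields to each annotation row sharing an EosCode."""
    linked = []
    for row in rows:
        sequence_entry = sequences.get(row["eos_code"])
        if sequence_entry is not None:
            # Merge the two records; annotation fields come first.
            linked.append({**row, **sequence_entry})
    return linked


linked_rows = link_by_eos_code(table_15_rows, table_16_sequences)
print(linked_rows[0]["eos_code"], linked_rows[0]["accession"])
```

Rows whose EosCode has no counterpart in the sequence listing are simply skipped in this sketch; a stricter version might report them instead.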



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

EosCode: Internal Eos name

Localization: Predicted cellular localization of gene product



Pkey ExAccn UnigeneID Unigene Title EosCode Localization



100394 
100452 
101249 
101485 
101514 
101851 
102398 
102522 
102669 
103119 
103709 
104080 
104144 
104691 
105370 
106149 
106579 
107102 
107217 
108153 
109014 
109112 



110151 
112971 
113021 
114908 
114965 
116393 
116416 
117698 
117984 
118985 
119018 
119126 
120992 
121710 
121913 
122041 
122593 
123209 
124526 
126399 
126645 
126966 
127537 
128790 
129109 
129184 



D84276 

D87742 

L33881 

M24736 

M28214 

M94250 

U42359 

U53347 

U71207 

X63629 

AA037316 

AA402971 

AA447439 

AA011176 

AA236476 

AA424881 

AA456135 

AA609723 

D51095 

AA054237 

AA156790 

AA169379 

H04649 

H18836 

T17185 

T23855r 

AA236545 

AA250737 

AA599463 

AA609219 

N410Q2 

N51919 

N94303 

N95796 

R45175 

AA398246 

AA41S011 

AA428062 

AA431407 

AA453310 

AA489711 

N62096 

AA128075 

AI167942 

R38438 

AA569531 

AA291725 

AA491295 

W26769 

AA621604 



Hs.66052 

Hs.241552 

Hs.1904 

Hs.123072 
Hs.82045 

Hs.183556 

Hs.29279 

Hs.2877 

Hs.13804 

Hs.57771 

Hs.183390 

Hs.37744 

Hs.22791 

Hs.256301 

Hs.23023

Hs.30652 

Hs.40808 

Hs.262036 

Hs.257924 

Hs.20843 

Hs.31608 

Hs.83883 

Hs.129836

Hs.54973 

Hs.72472 

Hs.39982 

Hs.45107 

Hs.106778 

Hs.55028 

Hs.278695

Hs.117183

Hs.97594



Hs.98732 
Hs.128749 
Hs.203270 
Hs.293185 

Hs.61635 

Hs.182575 

Hs.162859 

Hs.105700 

Hs.108708 

Hs.109201 



CD38 antigen (p45) PBC1 plasma membrane

KIAA0268 protein PAB7 not determined

protein kinase C, iota OAA1 cytoplasmic

selectin E (endothelial adhesion molecul ACC5 plasma membrane
RAB3B, member RAS oncogene family PFJ2 cytoplasmic
midkine (neurite growth-promoting factor LBH9
gb:Human N33 protein form 1 (N33) gene, PDG3
solute carrier family 1 (neutral amino a PFJ4
eyes absent (Drosophila) homolog 2 LEM9 cytoplasmic
cadherin 3, type 1, P-cadherin (placenta LBG2 plasma membrane
hypothetical protein dJ4620232 PD06
kallikrein 11 PBA6
hypothetical protein FLJ13590 PDM3
Homo sapiens beta-1 adrenergic receptor PAV1
transmembrane protein with EGF-like and PDM9
hypothetical protein MGC13170 P008
ESTs PAA4
KIAA1344 protein PAA3
DKFZP586E1621 protein PDG8
ESTs PBF1
ESTs, Weakly similar to Z223_HUMAN ZINC
hypothetical protein FLJ13782 BCU4
Homo sapiens cDNA FLJ11245 fis, clone PL
hypothetical protein FLJ20041 PAV9
transmembrane, prostate androgen induced
KIAA1028 protein PD03
cadherin-like protein VR20 PFJ6
ESTs BCY2
hypothetical protein MGC2648 PDV3
ESTs OAB6
ESTs PDT9
ATPase, Ca++ transporting, type 2C, memb
ESTs, Weakly similar to 154374 gene NF2 POM8
Homo sapiens prostein mRNA, complete cds
ESTs PBF8
KIAA1210 protein PDG5
prostate androgen-regulated transcript 1 PDV5
ESTs; protease inhibitor 15 (PI15) BCU7
Homo sapiens Chromosome 16 BAC clone CIT
alpha-methylacyl-CoA racemase PDOI
ESTs, Weakly similar to ALU1_HUMAN ALU S
ESTs, Weakly similar to JC7326 amino aci PAV4
transmembrane, prostate androgen induced
six transmembrane epithelial antigen of PAA5
solute carrier family 15 (H+/peptide tra PD05 plasma membrane
ESTs PAA6 not determined

secreted frizzled-related protein 4 BCX2 secreted
calcium/calmodulin-dependent protein kin PFJ7
CGI-86 protein PAV6 vesicular

spondin 2, extracellular matrix protein CJA5 not determined



plasma membrane 
plasma membrane 

plasma membrane 
not determined 

plasma membrane 
P0G7 

not determined 
PDG4 

plasma membrane 
CHA1 not determined 

plasma membrane 
mitochondrial 
secreted 

ER 

PAJ5 not determined 
- PAB2 plasma membrane 



vesicular 

PAZ1 not determined 

PAA2 plasma membrane 
plasma membrane 
PDY4 
129404 
129534 
130760 
131425 
132964 
132967 
133179 
133330 
133520 
133724 
133724 
133944 
134110 
301805 
302005 
302881 
303506 



AA172056 

R73640 Hs.11260 

AA128997 Hs.18953 

AA219134 Hs.26691 
AA031360 

AA032221 Hs.61635 

U81599 Hs.66731 



U42360 
X74331 
U07919 
U07919 



Hs.71119 
Hs.74519 
Hs.75746 
Hs.75746 



303753 
308050 
310382 
310431 
310573 
310598 
310816 
311596 
313676 
314121 
314691 
314785 
314907 
315051 
315052 
316442 
317548 
317869 
318524 
319191 
319763 
320324 
320561 
320796 
321441 
322303 
322782 
322818 



324295 
324430 
324603 
324617 
324626 
324658 
324718 
330211 



AA045870 Hs.7780
U41060 Hs.79136
AI800004 Hs.142846
AI869666 Hs.123119
AA508353 Hs.105314
AA340605 Hs.105887
030891 Hs.19525
AW503733 Hs.9414
AI460004 Hs.31608
AI734009 Hs.127699
AI420227 Hs.149358
AW292180 Hs.156142
AJ338013 Hs.140546
AI973051 Hs.224965
AI682088 Hs.79375
AA861697 Hs.120591
AI732100 Hs.187619
AW207206 Hs.136319
AI538226 Hs.32976
AI672225 Hs.222886
AW292425

AA876910 Hs.134427
AA760894 Hs.153023
AI654187 Hs.195704
AW295184 Hs.129142
AW291511 Hs.159066
AF071538

AA460775 Hs.6295
AF071202 Hs.139336
NM_006953 Hs.159330
AF038966 Hs.31218
AW297633 Hs.118498
W07459 Hs.157601
AA056060 HS2Q2577
AW043782 Hs.293616
AF055019 Hs.2t906
AA639902 Hs.104215
AI146686 Hs.143691
AA464018 Hs.184598
AW016378 Hs£92934
AA508552 Hs.195839
AI685464

AI694767 Hs.129179
AI557019 Hs.116467



330762 
330790 



331099 
331490 
331689 
332247 
332396 
332697 
332798 
334447 
338255 



U31382 

AA449677 

T48536 

AA149579 

R35671 

N32912 

AA431407 

N58172 

AA340504 

T94885 



Hs£99867 
Hs.15251 
Hs.122764 
Hs.91202 
Hs.14846 
Hs£91039 
Hs.98802 



ESTs PAB4
hypothetical protein FLJ11264 PAJ3
phosphodiesterase 9A PEE6
ESTs PBA7
ESTs PAA7
six transmembrane epithelial antigen of PM1 7
homeobox B13 PFJ5
Putative prostate cancer tumor suppresso PDM1
primase, polypeptide 2A (58kD) PDM2
aldehyde dehydrogenase 1 family, member
aldehyde dehydrogenase 1 family, member
Homo sapiens mRNA; cDNA DKFZp564A072 (fr
LIV-1 protein, estrogen regulated BCR4
hypothetical protein PEU4
MAD (mothers against decapentaplegic, Dr PBJ6
relaxin 1 (H1) PBH3
ESTs, Weakly similar to Homolog of rat Z PEG4
hypothetical protein FLJ22794 PBM4
KIAA1488 protein PBY3
hypothetical protein FLJ20041 PEU5
KIAA1603 protein PCQ8
ESTs, Weakly similar to A46010 X-linked PBH1
ESTs PEN3
ESTs PCW3
ESTs PET5
holocarboxylase synthetase (biotin-[prop PBH8
ESTs PBY2
ESTs PBY1
ESTs BFF8
guanine nucleotide binding protein 4 CB07
ESTs, Weakly similar to TRHY_HUMAN TRICH
ESTs PBM9
ESTs PBJ7
ESTs PBJ9
ESTs PBQ6
deoxyribonuclease II beta PBQ7
hypothetical protein FLJ10188 PBJ1
prostate epithelium-specific Ets transcr PEN1
ESTs, Weakly similar to T17248 hypotheti PE07
ATP-binding cassette, sub-family C (CFTR PBH5
uroplakin 3 PEL9
secretory carrier membrane protein 1 PBY4
Homo sapiens LUCA-15 protein mRNA, splic
ESTs CBF9
Homo sapiens cDNA FLJ12166 fis, clone MA
ESTs PCQ7
Homo sapiens clone 24670 mRNA sequence
ESTs, Moderately similar to SPCN_HUMAN S
ESTs PBQ9
Homo sapiens cDNA: FLJ23241 fis, clone C
ESTs PBM3
ESTs, Weakly similar to I38022 hypotheti PBH4
gbH88f04j(t NCI_CGAP_Pr28 Homo sapiens
Homo sapiens cDNA FLJ13581 fis, clone PL
small nuclear protein PRAC CBK1

PBJ2

guanine nucleotide binding protein 4 PEW1
hypothetical protein PBM1
TMPRSS2, transmembrane protease, serine
ESTs PBQ4
Homo sapiens mRNA; cDNA DKFZp564D016 (fr
ESTs PCM
ESTs, Moderately similar to T14342 NSDf PBH7
gb:za21f09.s1 Soares fetal liver spleen PBQ5
gb:hw31a09.x1 NCI_CGAP_Kid11 Homo sapien
transgelin 2 PBQ8
PBH2
PBY9
PBY7



secreted 
nuclear 

plasma membrane 
plasma membrane 
nuclear 

plasma membrane 

PDT1 mitochondrial 
PDT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 
cytoplasmic 
secreted 

not determined 
not&termined 
plasma membrane 

plasma membrane^ 
plasma membrane 



not determined 
cytoplasmic 
PBM2not determined 

plasma membrane 



cytoplasmic 



plasma membrane 
plasma membrane 
not determined 
PBY8 not determined 



PBQInottetermined 
plasma membrane 
PCI2 not determined 
PBJ5 

not determined 
PBY6 not determined 

cytoplasmic 
*PCW6 

PBJ4 plasma membrane 

nuclear 

not determined 

cytoplasmic 

not determined 

PEL3 plasma membrane 

plasma membrane 

PCQ1 cytoplasmic 

nuclear 

not determined 
nuclear 

PBJ8 not determined 

secreted 

nuclear 

not determined 
not determined 
401424 PFG2
407122 H20276 Hs.31742 ESTs PEW7
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine PEZ3
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence
409262 AK000631 Hs£2256 hypothetical protein FLJ20624 PFG1
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo PEW3
411096 U80034 Hs.68583 mitochondrial intermediate peptidase PEZ9
413125 BE244589 Hs.75207 glyoxalase I PFJ3
413623 AA825721 Hs.246973 ESTs OBH6
414422 AA147224 Hs337232 Homeobox A13 PFC6
415263 AA948033 Hs.130853 ESTs PE25
417153 X57010 Hs.81343 'collagen, type II, alpha 1 (primary ost PFJ1
418601 AA279490 Hs.86368 calmegin PFA1
418848 AI820961 Hs.193465 ESTs PEY4
418882 NM_004996 Hs.89433 ATP-binding cassette, sub-family C (CFTR OBH2
419839 U24577 Hs.93304 *phospholipase A2, group VII (platelet-a PFH9
421887 AW161450 Hs.109201 CGI-86 protein PFH2
422083 NM_001141 Hs.111256 'arachidonate 15-lipoxygenase, second ty PFH5
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 PFA3
425071 NM_013989 Hs.154424 •deiodinase, iodothyronine, type II PFH6
425710 AF030880 solute carrier family, member 4 PFD4
427958 AA418000 Hs.98280 potassium intermediate/small conductance PFH1
428819 AL135623 Hs.193914 KIAA0575 gene product PFD6
429900 AA460421 Hs.30875 ESTs PEZ7
429918 AW873986 Hs.119383 ESTs PEY5
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface PEZ4
431217 NM_013427 Hs.250830 RhoGTPase activating protein 6 PFG6
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain PEZ1
431992 NM_002742 Hs.2891 protein kinase C, mu PFH4
432189 AA527941 gb:nri30c04.s1 NCI_CGAP_Pr3 Homo sapiens
432244 AI669973 Hs.200574 ESTs PEW8
432437 W07088 Hs.293685 ESTs PFG3
432966 AA650114 Hs.325198 ESTs PEY3
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr PEWS
440260 AI972867 Hs.7130 copine IV PEW6
440901 AA909358 HsJ28612 ESTs PFC8
445424 AB028945 cortactin SH3 domain-binding protein PEZ6
446320 AF126245 Hs.14791 •acyl-Coenzyme A dehydrogenase family, m
447210 AF035269 phosphatidylserine-specific phospholipas PFH8
449156 AF103907 Hs.171353 prostate cancer antigen 3, non-coding DD PEZ8
449625 NM_014253 odz (odd Oz/ten-m, Drosophila) homolog 1 PEZ2
449650 AF055575 Hs.23838 calcium channel, voltage-dependent, L ty PFD2
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 PFJ8
451982 F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp56401763 (f
452039 AI922988 ESTs PFD8
452340 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma PFG4
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 PFC5
452946 X95425 Hs.31092 EphA5 PFH3



plasma membrane 

PEY1 

nuclear 

nuclear 

mitochondrial 

cytoplasmic 



ER 



secreted 

plasma membrane 
cytoplasmic 

secreted 

plasma membrane 
plasma membrane 
nuclear 



plasma membrane 
nuclear 

cytoplasmic 
PFA2 



PFH7 



plasma membrane 
plasma membrane 

PFG9 plasma membrane 

nuclear 
cytoplasmic 
plasma membrane 
TABLE 15A shows the accession numbers for those primekeys lacking a UnigeneID in Table
15. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accession



116393 131543J 



101485 18113J 
126399 17331 J 



132964 
129389 



94346J 
21074J 



129404 
107217 



121710 19266J 



121913 291015J 



102398 
.315051 
324826 
319191 



A1972402 AI634409 AI523716 A1799749 W44518 A1424438 AI688513 AI971048 AI686324 AW01 3854 AA588483 AA5281 1 1 A1627428 
AI582200 AI669296 A1826926 AI620526 AI669958 AI972458 AI924500 AA512903 W44517 AA335363 AW238997 BE300165 
BE250665 AA284195 AA523420 W52834 AI471970 AI952824 AW003820 AW009463 AA669796 AA1 14966 AI653342 AA1 15038 
AI342150 AI092100 AI96821 1 W51994 AI804005 AI201420 AM23210 AI738405 AI674964 AI970341 AW027500 AI493316 AI333193 
AI139353 AA599463 AI6561 63 AI804200 AI365321 AI990213 AI657011 AA650025 AI968810 AI341978 AA599839 AW592602 
AA644269 A1468578 AI565265 AI565228 BE221535 AW973052 

AA296520 AL021 940 M30640 NM.000450 M24736 M61894 AL047443 H39560 AI694691 AA916787 A1214796 AA939085 AI150616 
AA412553 AA412545 AI051015 T27654 AA694430 

AA088767 AF224278 AA128075 AL035541 AA027926 AI761441 AI972096 AW071693 A042327 AI377498 AI804815 AI640802 
AI885001 AJ921394 AA5951 15 N71820 AJ921217 AW007283 AI467828 AI369306 AA917446 AI493698 AA088701 AA126899 A1936228 
AW204238 AI039567 AJ925027 BE138909 AW452945 AW135998 AA310984 AA027860 AW073519 AI537597 AA953976 AI521341 
AW273569 AW050740 AA5361 13 AA559064 AJ474392 AW135709 AA535181 AW572959 AA570597 AI905464 AI677810 AI587642 
AW975102 AA424310AA482527 N64192M658276AW889117AA486591 AW889172AI381990 AI381991 AI673419 AI990950 
AA487031 AI272934 AI150565 AA22916B AW316722 AI142707 BE222398 AA614168 AA122026 AW338227 AA632457 AI968726 
AW369662 AA512956 AA541675 AA451748 AI250993 BE146418 AA122025 
AI362575 AIB05082 AW263421 AI432462 AA135870 AA031360 AA031604 AA298475 AA298464 

NM_012445 AB027466 BE407510 BE047605 AA047125 AW084003 AA149494 AA149490 AA292528 AA570505 AA526186 AW006250 
AW007762 A1341557 AI799666 AI972710 AI377966 AI962810 AI084783 AI458032 AI190971 AW148913 AA372354 AW970032 
AW007426 AA650188 AI123203 AI122890 AI280975 W73595 W73495 AI863238 AA374109 AA603986 AW149089 AW957523 
AJ307748 AI921067 A1336463 F24537 A1380460 AI367500 All 89309 AI814701 AI766921 AW572106 AA037024 AW072576 AA578293 
AI288103 AA235464 AW450642 AA574230 AW294024 A1589229 A1580733 AW512227 AA877009 A1660255 AW1 88597 AA558228 
A1572782 AA658397 AI274628 AI866359 AA864573 AI264439 AA621604 AW515493 AW243333 Z39737 AI567038 AA573997 
AA573559 AW236431 AI652870 A! 684973 AA034505 AA047126 

AI267700 AI720344 AA191424 AI023543 AI469633 AA172056 AW95B465 AA172236 AW953397 AA355086 
AL08Q235AA031750D81382AI480231 AI095947 A1560953 BE010721 AI87Q290 AA374945 AA125792 051527 D51556 AI685541 
D51559 AW1 17286 AA195741 AI675138 AW593439 AJ201885 T30590 AW952100 D51095 AA523864 W70O43 AA987586 AW21515 
AI205532 AA127069 AI337387 D51595 AI453785 AW075677 AW088359 C14287 C14284 

AF163474 NMJH6590 AF163475 AI761105 Af770098 AA410580 AA41 1616 AI590343 AI739050 AL050198 AI862645 AA419104 
AA513809 AA333032 AI816915 AW139625 AA640689 AI31 1391 AI627693 AW135514 AA41901 1 AI269149 AI245259 AI970008 
A1970017 AW139445 AA569503 AI761072 AI766179 AT759995 AI300776 A1670129 AW150770 AA226501 AA226220 . 
A1249368 AI742316 AA428062 AA442089 AI864189 BE349478 AI803475 AJ584049 BE552085 AI088609 AI264197 AI886144 AM 29474 
AI307145 BE181300 AW058403 AI696838 AW748598 AA442196 A1216428 
entrez_U42359U42359 

347217J AW292425 BE467167 A1702953 BE550961 BE222309 A1299348 AI693336 AA541708 

A1685464 AW971336 AA513587 AA525142 , 
NMJM2391 AF071538AB031549AI685592 AI745526 AA662204 AW130657AA662164 AW971121 AI668916 AA513274 AI991223 
A1979170 AW298436AA639821 AI859010AW5 13942 AI687669 AA662521 AA548598 AI345056 A1305374BE043418 AI432856 
AI334840 AI379796 A1492693 AI307915 BE042082 AI307834 A1307858 AI309488 BE042210 A1435670 A1371605 AI862491 AI284563 
AI306872 AI255044 A1254601 A 125 1236 AJ473073 AJ473042 A1432760 AI435664 A1336826 AI289365 AI369096 AJ862274 AI334871 
AI349863 AI250405 A1377617 AI309895 AI313017 AI862291 AI31 1936 AI378718 AI305722 AI306769 AI308888 A1334565 AI862296 
A1344230 A1435685AI344087AI378696 AI311209 AI435775 AI310611 AI311154 AI432289 AI431561 AI492681 AW32867 A1335288 
AW92796 AI432769 AI310299 AI432273 AI379820 A1275319 AI435753 AI609441 AI432767 AI369100 AI31 1420 AI349974 A1247157 
A1334677 AI270910 A1224320 AI305608 A1334489 AI377152 AI350012 A1370086 AI335053 AI306781 AI306750 AI334849 AI334874 
AI340380 AI307876 A1305974 AI305972 AI31 1 521 AI334872 AI862509 A131 1498 AI335051 AI289684 AI310859 AD 11 862 AI862483 
AJ492775 AI307906 A1492708 AI289693 A1340373 AI307910 AI31 1359 A1435653 AI334865 AI31 1492 AI492809 AI492690 AI431 576 
AI862268 AI311879 AJ303435 AI492792 AI862512 AI275321 AI431568 AI431564 AI307885 AI307926 AI435692 A1435778 AI310182 
AI308894 A1492707 A1492713 A1308560 AI307829 AI343234 AI580598 AW472796 AI34091 8 AI3 10243 A1309368 AI307920 AKB9665 
338255
330211 
332798 
334447 
332247 

332396 20265.1 



332697 13699J 



425710 
432189 
445424 



447210 7119J 
449625 8113J 



452039 B9513J 



A1306777 AW08S318 AW086292 AW086378 AJ310027 A1275293 A1369G82 AI34O900 AI306749 AI371558 AW086287 BE043803 
A1306793 A1306272 AI287948 AI270917 AI284816 AI336813 AK84546 AB08044 AI275290 AI270872 AI306795 AI289687 AI223570 
AJ305303 A1289577 AI287742 A1275284 AI306812 AI336701 AI371554 AI378719 AI344988 AI223631 AI335141 AI343222 AI284568 
A1305357 A1275270 AI345932 AI436549 AI307925 A131 1502 AI344238 AB43182 AI308508 AI305988 AI270790 AB79792 A1305647 
AI305410 AM32251 AI436517 A1343227 AJ305534 AI340387 AI271043 AG05499 AI271046 AI305962 A1289465 AC05378 A1289725 
AI310848 A1305848 AI289362 A1252964 AI307049 AI310831 AI306993 AI306796 AI224659 AI305969 A1349855 AI308164 A1306948 
A1284676 AI309155 AI343202 A1432785 AI306815 AI369081 AI270885 AI289699 AI435704 AI309647 AI305716 AI31 1281 A1287927 
AJ472995 AI340423 AI270958 AJ307069 A1305364 A1270807AI275306 AI311B90AI275263AI432750AJ289371 AJ432861 AI255113 
AI305709 A1473008 AI311 168 AI309711 AI377164 AI271201 AI289560 AI309710 AI306195 AI31 1201 AI287741 AI271066 AI432876 
A1275281 AI379795 AI472972 AI31 1967 A1306826 AI305465 AI270792 AI473019 AI305340 AI270922 AI305995 A1305462 AI254144 
A1270969 AI473012 AI305390 AI275278 A!223644 A1289692 AI250318 AI305372 AI289691 AI250521 AI306283 AI306814 AJ307933 
AI473160 AI432903 AI223720 AI254979 AI334862 A1306926 AI289541 AI432248 AI435722 AI435698 AI432859 AI310683 AI473175 
AI335144 A1289467 A1436489 AI306928 AM73033 AI305763 AI307868 A1307882 AI348959 AI435736 AI432857 AI432896 A1435735 
AI432283 AI473086 A1432863 AI473081 AI432825 AI307840 AI473164 AI432885 AI473166 At472982 AI435734 AI473060 AI473171 
AI432279 AI432882 AI334670 AI436512 AI432827 AI432852 AI473051 AI473077 AI435697 AI271509 AI492781 AI472983 AI473018 
A1432897 AI473043 AI432871 AI436536 AI473157 AI349715 AI432777 AI473016 AM73158 AI340369 AI307941 AW32773 AI377146 
AI492791 AI270950 AI305342 At284604 AJ306269 AI28481 1 AI27081 1 AI289347 AI334869 AI334852 AI31 1759 AI250382 AI309520 
AI289550 AI305721 AI340870 AI270901 AI308575 AI3079O4 AI340715 AI270941 AI309808 AI246867 AM73014 AI307039 AI289360 
AI473069 A1492786 AI344013 AI305876 AI438510 AJ340742 AI473028 AI307891 BE041871 BE041268 BE042340 BE041946 
BE041783 A1306173 AI201948 AI926972 AI275769 
CH22_6856FG_UNK_EM:AC00 
C_5_p2 

CH22J4FGJJJJNK.C4G1.G 
CH22_1745FG_387_7_L1NK_EM 

372969 J AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW 118292 AA579216 N58172 

AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798 R17370 AI908947 
AA382932 R58449 HI 8732 AA371231 AW962899 AA713530 AW892946 R53463 H1 1063 AW068542 Z40761 6E176212 BE176155 
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AW78161 
BE463983 AI805213 AI761264 W94885 N94502 AI623772 AI419532 AI810302 A1634190 AW002516 AW150777 A1352312 AI367474 
AW204807 AI675502 AI337026 AW134715 BE32B451 AI123157 AI560Q20 AI300745 AI608631 AI248873 AA742484 AW051635 
H18646 AI245045 AA5071 1 1 AI640510 AI925594 AA1 15747 AA143035 AA151 106 

X51405 NM_0G1873 T1 1322 AL1 18886 BE328175 AW136009 BE467445 AW470313 AA774852 BE504139 AW501046 AA082792 
AW389231 AA370044 R36841 AA371457 C04813 R25791 R25556 AW895854 AW903819 AW895671 AW895677 BE159723 
AW895664 AW895597 AW895595 AW895665 AW888518 AI903724 F06081 1=08503 AL1 19462 AW895730 AW888516 R2651 1 
R26489 AA334126 AA327626 N85713 AW695998 AA223622 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AA001282 AA001 138 AA551566AA330159AI922855AA383512M029603D82246 082171 T94933 H56545 AA348060 
AA176888 R96764 AW451817 AA385766 AA452618 A1690057 AA988822 BE549928 AA150901 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA369982 AW385948 AA922466 N75882 AM22070 AI361256 AI680224 D57122 T94885 
R53266 R46713 T19071 AW796277 AA325333 F04719 F02334 AA358146 AA626597 AA358304 AW028099 AL1 19570 D57290 
058273 D57796 N48555 AI361969 AA329457 057225 AW024046 AA992606 AWQ221 16 AWQ21538 AA935645 H89870 H56546 
AW9612 19 AA453239 AW837541 N45521 BE216029 AA318877 AA327740 AW961 809 T92139 D53216 D52365 D53363 053312 
D531 16 AI547267 AA679935 AW026552 AW026418 A W1 90507 AI92771 0 AW244108 D50948 AW054991 AW021063 AW02251 1 
AA493436 AI365636 BE464751 AW149384 AA102442 AW771368 AI818251 AI126368 D51049 AI421542 AI559467 AW079779 
AW021048 AW023969 AW044214 A1458264 AA027274 AI620254 AWQ28917 BE21951 1 AA326242 N57561 AI971273 AA878328 
D57131 AA770662 AI309299 AI796767 MS 13338 W58076 A1556287 A1445573 AI880260 AA00191 9 AW339259 AM9261 0 AI49261 1 
R97692 A! 301425 AA722603 058361 AI350323 AA973926 AI431263 AA516126 AA865467 Al 925 177 N33443 AA001943 A1299371 
AI082412 AA665090 AA583433 H89871 AA977231 AI362219 AI056096 AJ270446 N67524 N22103 AW614224 AA744054 AW243622 
AI613188 AI929173 A1350243 A13621 38 AA744004 AA176661 D56787 AI 955625 AI393109 A1094769 A1479728 AI423107 AI955617 
AI034036 A1582196 AW264534 AI418961 AA570761 AJ343538 AA650341 AA992503 AA770004 AL039666 AI862675 AW 190335 
AA610274 AW418627 BE467472 D56786 T28749 AI217610 AI359556 T23523 AL040189 AA846222 AA651636 051280 A1888986 
A1521 167 AI340177 AW612815 AI625285 AA621607 AA177059 AA229768 AA829788 AI749682 AW190631 N75299 AA230089 
AI915632 BE069542 AA890020 AA528397 AA995390 BE503860 AA570812 AW339396 AM 97986 AI203725 AI282379 AA670375 
AA461513 F0172B AW243599 C00856 N75567 R95995 AA1 50932 A95961 AA648060 AA933800 AA927073 AA101 126 AA864190 
T93566BE167472 

AF030880 NM.000441 AC002467 AA385554 H23053 AW891838 AI139968 AA653057 AI695233 
AA527941 AI810608 AI620190 AA635266 

AB028945 T77648 F13328 AL157605 Z46212 AA304736 F1 1855 T66098 T30174 AW954164 AW 176301 AW748243 AA456428 
AI369958 AA938565 AW959613 Z42008 AA994779 AI683909 F11019 F10926 AI769597 A1752550 T65015 AI884314 AA643954 
Z41838 AW020147 AI038822 AW571822 AA299781 AA69492B AF 131790 BE0054 1 1 AI902476 AW082695 AA464384 R42750 
AW902301 AA464273 R05837 Z38294 H41098 AL134507 M86079 

AF035269 AF035268 NM_0 15900 T96213 U37591 AA156832 AA299371 AI084325 H95977 AJ765967 BE221465 AA156726 AT969563 
AW024539 AI436791 AI949451 AA843093 AI452756 AA824232 AI306667 T96131 AW207447 AW243556 AW957032 AI084332 
H95978 U30998 

NM^014253 AF100772 BE088769 AL022718 BE161779 AW863569 BE161640 AL039060 BE1 68542 AW296554 AA323193 AA235370 
AW779760 N48674 AI375997 R45432 059344 AI203107 F07491 R35360 R25094 AI913631 A1498402 T61382 AI016320 N45526 
T61415 AA331486

AT922988 H05475 AA021608 AW169947 AA913750 Z41614 AW800012 
TABLE 15B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 15. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.

Pkey: Unique number corresponding to an Eos probe set

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey      Ref                  Strand    Nt_position

334447    Dunham, I. et al.    Plus      14308764-14308824
332798    Dunham, I. et al.    Minus     232147-231974
338255    Dunham, I. et al.    Minus     15242294-15242231
330211    6013592              Plus      59158-59215
401424    8176894              Plus      24223-24428
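
The exon coordinates in Table 15B lend themselves to a simple consistency check. The following Python sketch is illustrative only and is not part of the disclosed methods. The class name `PredictedExon`, the helper `parse_row`, and the assumption that the listed coordinates are inclusive are all introduced here for illustration. It transcribes the five rows of the table and computes the length of each predicted exon, treating Minus-strand ranges as written from the higher to the lower coordinate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    """One row of Table 15B (hypothetical container, not from the source)."""
    pkey: int
    ref: str
    strand: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumes inclusive coordinates; Minus-strand rows list start > end.
        return abs(self.end - self.start) + 1


def parse_row(pkey: int, ref: str, strand: str, nt_position: str) -> PredictedExon:
    """Split an 'a-b' Nt_position string into integer coordinates."""
    start_text, end_text = nt_position.split("-")
    return PredictedExon(pkey, ref, strand, int(start_text), int(end_text))


# Rows transcribed from Table 15B.
ROWS = [
    (334447, "Dunham, I. et al.", "Plus", "14308764-14308824"),
    (332798, "Dunham, I. et al.", "Minus", "232147-231974"),
    (338255, "Dunham, I. et al.", "Minus", "15242294-15242231"),
    (330211, "6013592", "Plus", "59158-59215"),
    (401424, "8176894", "Plus", "24223-24428"),
]

exons = [parse_row(*row) for row in ROWS]

for exon in exons:
    print(exon.pkey, exon.strand, exon.length)
```

Keeping the coordinates exactly as listed, rather than normalizing them to ascending order, preserves the strand convention used in the table.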
TABLE 11 AND SEQUENCE LISTING

SEQ ID NO:1 BCU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_024915

Coding sequence: 13-1890 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60 
ATGCCCAGTG ACCCTCCATT CAATACCCGA AGAGCCTACA CCAGTGAGGA TGAAGCCTGG 120 
AAGTCATACT TGGAGAATCC CCTGACAGCA GCCACCAAGG CCATGATGAT CATTAATGGT 180
GATGAGGACA GTGCTGCTGC CCTCGGCCTG CTCTATGACT ACTACAAGGT TCCTCGAGAC 240
AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGGA GAAAAGAAAC 300 
TGCCTTGGCA CCAGTGAAGC CCAGAGTAAT TTGAGTGGAG GAG AAAACCG AGTGCAAGTC 360 
CTAAAGACTG TTCCAGTGAA CCTTTCCCTA AATCAAGATC ACCTGGAGAA TTCCAAGCGG 420 
GAACAGTACA GCATCAGCTT CCCCGAGAGC TCTGCCATCA TCCCGGTGTC GGGAATCACG 480 
GTGGTGAAAG CTGAAGATTT CACACCAGTT TTCATGGCCC CACCTGTGCA CTATCCCCGG 540 
GGAGATGGGG AAGAGCAACG AGTGGTTATC TTTGAACAGA CTCAGTATGA CGTGCCCTCG 600 
CTGGCCACCC ACAGCGCCTA TCTCAAAG AC GACCAGCGCA GCACTCCGGA CAGCACATAC 660 
AGCGAGAGCT TCAAGGACGC AGCCACAGAG AAATTTCGGA GTGCTTCAGT TGGGGCTG AG 720 
GAGTACATGT ATGATCAGAC ATCAAGTGGC ACATTTCAGT ACACCCTGGA AGCCACCAAA 780 
TCTCTCCGTC AGAAGCAGGG GG AGGGCCCC ATGACCTACC TCAACAAAGG ACAGTTCTAT 840 
GCCATAACAC TCAGCGAG AC CGGAGACAAC AAATGCTTCC GACACCCCAT CAGCAAAGTC 900 
AGGAGTGTGG TGATGGTGGT CTTCAGTGAA GACAAAAACA GAG ATGAACA GCTCAAATAC 960 
TGGAAATACT GGCACTCTCG GCAGCATACG GCGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020 
TACAAGGAGA GCTTTAATAC GATTGGAAAC ATTGAAGAGA TTGCATATAA TGCTGTTTCC 1080 
TTTACCTGGG ACGTGAATGA AGAGGCGAAG ATTTTCATCA CCGTGAATTG CTTGAGCACA 1 140 
GATTTCTCCT CCCAAAAAGG GGTGAAAGGA CTTCCTTTGA TGATTCAGAT TGACACATAC 1200 
AGTTATAACA ATCGTAGCAA TAAACCCATT CATAGAGCTT ATTGCCAGAT CAAGGTCTTC 1 260 
TGTGACAAAG GAGCAGAAAG AAAAATCCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320 
GGGAAAGGCC AGGCCTCCCA AACTCAATGC AACAGCTCCT CTGATGGGAA GTTGGCTGCC 1380 
ATACCTTTAC AGAAGAAGAG TGACATCACC TACTTCAAAA CCATGCCTGA TCTCCACTCA 1440 
CAGCCAGTTC TCTTCATACC TG ATGTTCAC TTTGCAAACC TGCAG AGGAC CGGACAGGTG 1 500 
TATTACAACA CGGATGATGA ACGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 
CCCATGGAAG AGGAGTTTGG TCCGGTGCCT TCAAAGCAGA TGAAAGAAGA AGGGACAAAG 1620 
CGAGTGCTCT TGTACGTGAG GAAGGAGACT GACGATGTGT TCGATGCATT GATGTTGAAG 1680 
TCTCCCACAG TGATGGGCCT G ATGGAAGCG ATATCTGAGA AATATGGGCT GCCCGTGGAG 1740 
AAGATAGCAA AGCTTTACAA GAAAAGCAAA AAAGGCATCT TGGTGAACAT GGATGACAAC 1800 
ATCATCGAGC ACTACTCGAA CGAGGACACC TTCATCCTCA ACATGGAGAG CATGGTGGAG 1860 
GGCTTCAAGG TCACGCTCAT GGAAATCTAG CCCTGGGTTT GGCATCCGCT TTGGCTGGAG 1920
CTCTCAGTGC GTTCCTCCCT GAGAGAGACA GAAGCCCCAG CCCCAGAACC TGGAGACCCA 1980 
TCTCCCCCAT CTCACAACTG CTGTTACAAG ACCGTGCTGG GGAGTGGGGC AAGGG ACAGG 2040 
CCCCACAGTC GGTGTGCTTG GCCCATCCAC TGGCACCTAC CACGGAGCCG AAGCCTGAGC 2100 
CCCTCAGGAA GGTGCCTTAG GCCTGTTGGA TTCCTATTTA TTGCCCACCT TTTCCTGGAG 2160 
CCCAGGTCCA GGCCCGCCAG GACTCTGCAG GTCACTGCTA GCTCCAGATG AGACCGTCCA 2220 
GCGTTCCCCC TTCA AGAGAA ACACTCATCC CGAACAGCCT AAAAAATTCC CATCCCTTCT 2280 
TTCTCACCCC TCCATATCTA TATCTCCCGA GTGGCTGGAC AAAATGAGCT ACGTCTGGGT 2340 
GCAGTAGTTA TAGGTGGGGC AAGAGGTGGA TGCCCACTTT CTGGTCAGAC ACCTTTAGGT 2400 
TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTCCAGGGTT CCCAGCAAGT GGCCACCAGG 2460 
CCTTGTACAG GAAGACATTC AGTCACCGTG TAATTAGTAA CACAGAAAGT CTGCCTGTCT 2520 
GCATTGTACA TAGTGTTTAT AATATTGTAA TAATATATTT TACCTGTGGT ATGTGGG CAT 2 580 
GTTTACTGCC ACTGGCCTAG AGGAGACACA GACCTGGAGA CCGTTTTAAT GGGGGTTTTT 2640 
GCCTCTGTGC CTGTTCA AGA GACTTGCAGG GCTAGGTAGA GGGCCTTTGG GATGTTAAGG 2700 
TGACTGCAGC TGATGCCAAG ATGGACTCTG CAATGGGCAT ACCTGGCGGC TCGTTCCCTG 2760 
TCCCCAGAGG AAGCCCCCTC TCCTTCTCCA TGGGCATGAC TCTCCTTCGA GGCCACCACG 2820 
TTTATCTCAC AATGATGTGT TTTGCCTG AC TTTCCCTTTG CGCTGTCTCG TGGGAAAGGT 2880 
CATTCTGTCT GAGACCCCAG CTCCTTCTCC AGCTTTGGCT GCGGGCATGG CCTGAGCTTT 2940 
CTGGAG AGCC TCTGCAGGGG GTTTGCC ATC AGGGCCCTGT GGCTGGGTCT GCTGCAGAGC 3000 
TCCTTGGCTA TCAGG AGAAT CCTGGACACT GTACTGTGCC TCCCAGTTTA CAAACACGCC 3060 
CTTCATCTCA AGTGGCCCTT TAAAAGGCCT GCTGCCATGT GAGAGCTGTG AACAGCTCAG 3120 
CTCTGAGTCG GCAGACTGGG GCTTCCTCCT GGGCCACCAG ATGGAAAGGG GGTATTGTTT 3180 
GCCTCACTCC TGGATGCTGC GTTTTAAGGA AGTGAGTGAG AAAGAATGTG CCAAGATACC 3240 
TGGCTCCTGT GAAACCAGCC TCAGG AG GG A AACTGGGAGA GAGAAGCTGT GGTCTCCTGC 3300 
TACATGCCCT GGGAGCTGGA AG AGAAAAAC ACTCCCCTAA ACAATCGCAA AATGATGAAC 3360 
CATCATGGGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCGTT CGCCCTTGTG 3420 
GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATCCTGC ACCCAAAGGT TCTCTACAAA 3480 
GGCCCAGATT TGTTTGTAAA GCACTTTG AC TCTTACCTGG AGGCCCGCTC TCTAAGGGCT 3540 
TCCTGCGCTC CCACCTCATC TGTCCCTGAG ATGCAGAGCA GGATGGAGGG TCTGCTTCTA 3600 
GCTCAGCTGT TTCTCCTTG A GGTTGCGGAG GAATTG AATT GAATGGGACA GAGGGCAGGT 3660 
GCTGTGGCCA AG A AGATCTC CGAGCAGCAG TGACGGGGCA CCTTGCTGTG TGTCCTCTGG 3720 
GCATGTTAAC CCTTCTGTGG GGCCAAAGGT TTGCATCGTG GATCCAGCTG TGCTCCAGTC 3780 
TGTCCCCTCC TCCTCCACTC TGACTGCCAC GCCCCGGACC AGCAGCTTGG GGACCCTCCA 3840 
GGGTACTAAT GGGGCTCTGT TCTGAGATGG ACA AATTCAG TGTTGGAAAT ACATGTTGTA 3900 
CTATGCACTT CCCATGCTCC TAGGGTTAGG AATAGTTTCA AACATGATTG GCAG ACATAA 3960 
CAACGGCAAA TACTCGGACT GGGGCATAGG ACTCCAG AGT AGGAAAAAGA CAAAAGATTT 4020 
GCCAGCCTGA CACAGGCAAC CTACCCCTCT CTCTCCAGCC TCTTTATG AA ACTGTTTGTT 4080 
TGCCAGTCCT GCCCTAAGGC AGAAGATGAA TTGAAGATGC TGTGCATGTT TCCTAAGTCC 4140 
TTGAGCAATC ATGGTGGTG A CAATTGCCAC AAGGG AT ATG AGGCCAGTGC CACCAGAGGG 4200 
TCGTGCCAAG TCCCACATCC CTTCCGATCC ATTCCCCTCT GTATCCTCGG AGCACCCCAG 4260
TTTGCCTTTG ATGTCTCCGC TGTGTATGTT AGCTG AACTT TGATGAGCAA AATTTCCTG A 4320 
GCGAAACACT CCAAAG AG AT AGGAAAACTT GCCGCCTCTT CTTTTTTGTC CCTTAATCA A 4380 
ACTCAAATAA GCTTAAAAAA AATCCATGGA AGATCATGGA CATGTGA AAT GAGCATTTTT 4440 
TTCTTTTCTT TTTTTTTTTT TTTTTTTAAC AA AGTCTGAA CTGAACAGAA CAAGACTTTT 4500 
TCCTC ATACA TCTCCAAATT GTTTAAACTT ACTTTATGAG TGTTTGTTTA GAAGTTCGGA 4560 
CCAACAGA AA AATGCAGTC A GATGTCATCT TGG AATTGGT TTCTAAAAGA GTAAGGCATG 4620 
TCCCTGCCCA GAAACTTAGG AAGCATGAAA TAAATCAAAT GTTTATTTTC CTTCTTATTT 4680 
AAAATCATGC TAATGCAACA GAAATAGAGG GTTTGTGCCA AATGCTATGA ACGGCCCTTT 4740 
CTTAAAGACA AGCAAGGGAG ATTGATATAT GTACAATTTG CTCTCATGTT TTT 

SEQ ID NO:2 BCU4 Protein sequence:
Protein Accession #: NP_079191.1

1 11 21 31 41 51
I I I I I I

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMIINGDEDS 60
AAALGLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNCLGT SEAQSNLSGG ENRVQVLKTV 120
PVNLSLNQDH LENSKREQYS ISFPESSAII PVSGITVVKA EDFTPVFMAP PVHYPRGDGE 180
EQRVVIFEQT QYDVPSLATH SAYLKDDQRS TPDSTYSESF KDAATEKFRS ASVGAEEYMY 240
DQTSSGTFQY TLEATKSLRQ KQGEGPMTYL NKGQFYAITL SETGDNKCFR HPISKVRSVV 300
MVVFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FNTIGNIEEI AYNAVSFTWD 360
VNEEAKIFIT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQIKVFCDKG 420
AERKIRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480
FIPDVHFANL QRTGQVYYNT DDEREGGSVL VKRMFRPMEE EFGPVPSKQM KEEGTKRVLL 540
YVRKETDDVF DALMLKSPTV MGLMEAISEK YGLPVEKIAK LYKKSKKGIL VNMDDNIIEH 600
YSNEDTFILN MESMVEGFKV TLMEI

SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1:

Nucleic Acid Accession #: AA428062 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2:

Nucleic Acid Accession #: AA428062

Coding sequence: 1-777 (entire sequence represents open reading frame) 



1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 
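
The two BCU7 variants can be compared row by row to locate where they differ. The Python sketch below is illustrative only; the function name `point_differences` is hypothetical, and only the rows covering nucleotides 541-600 of SEQ ID NO:3 and SEQ ID NO:4 are transcribed here, on the assumption that the remaining rows are identical as printed. It reports each mismatching position together with the codon it falls in, taking the reading frame to start at nucleotide 1 as stated for the coding sequence.

```python
def point_differences(row_a: str, row_b: str, first_position: int):
    """Return (position, base_a, base_b) for every mismatch between two rows.

    Whitespace between the ten-nucleotide blocks is ignored, and positions
    are 1-based, counted from first_position.
    """
    a = "".join(row_a.split()).upper()
    b = "".join(row_b.split()).upper()
    if len(a) != len(b):
        raise ValueError("rows must cover the same number of nucleotides")
    return [
        (first_position + i, x, y)
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


# Rows ending at position 600, transcribed from SEQ ID NO:3 and SEQ ID NO:4.
VARIANT_1_ROW = "TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG"
VARIANT_2_ROW = "TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG"

differences = point_differences(VARIANT_1_ROW, VARIANT_2_ROW, first_position=541)

# Codon number for each differing nucleotide (coding sequence starts at 1).
codon_numbers = [(position - 1) // 3 + 1 for position, _, _ in differences]

print(differences, codon_numbers)
```

In this transcription the single mismatch falls in codon 190 (GCT in variant 1, ACT in variant 2), which agrees with the alanine and threonine shown at residue 190 of the two BCU7 protein variants.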

SEQ ID NO:5 BCU7 Protein sequence Variant 1:
Protein Accession #: none

1 11 21 31 41 51

I I I I I I

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCPGPMC THYTQMVWAT 180
SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240
TDNLCFPGVT SNYLYWFK

SEQ ID NO:6 BCU7 Protein sequence Variant 2:
Protein Accession #: none

1 11 21 31 41 51

I I I I I I

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCPGPMC THYTQMVWAT 180
SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240
TDNLCFPGVT SNYLYWFK



SEQ ID NO:7 BCX2 DNA SEQUENCE

Nucleic Acid Accession #: NM_003014

Coding sequence: 238-1278 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GGCGGGTTCG CGCCCCG AAG GCTGAGAGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60 
CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 
AAACTCTCCT GCGCCCCAG A AG A I IIC11C CTCGGCGAAG GGACAGCGAA AGATGAGGGT 1 80 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGCGAGAGGG CAGTGCCATG 240
TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG TGGCTGCACC TGGCGCTGGG CGTGCGCGGC 300 
GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGG AA CATCACGCGG 360 
ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420 
GAGGAGCTGG TGG ACGTGAA CTGCAGCGCC GTGCTGCGCT TCTTCTTCTG TGCCATGTAC 480 
GCGCCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540 
CAACGCGCGC GCGACGACTG CGAGCCCCTC ATG AAGATGT ACAACCACAG CTGGCCCGAA 600 
AGCCTGGCCT GCGACGAGCT GCCTGTCTAT GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660 
ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCTTGATGT TGACTGTAAA CGCCTAAGCC CCGATCGGTG CAAGTGTAAA 780 
AAGGTG AAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 
GAGATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960 
TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 
CGTTCAAGG A TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1 140 
AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200 
GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 
AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC T7XCTTACAG 1320 
GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCCCCT TGCCCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGGTTCA 1 440 
Gill 11 CI II GTAAGCCATC AC AAG CCA TA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 
GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCG ACC TAATATGTGC ATTGTAAAAT 1620 
AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTG A 1680 
TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT GAAAATATAA TGTTTTTAAG 1740 
AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 
TTTTTGTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860
TXjTGTTTTTT TACCAATGAC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920 
AATAATAAAG AAAAATAAAT AAAAAGG AG A GGCAG ACAAT GTCTGG ATTC CTGTTTTTTG 1980 
GTTACCTGAT TTCCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040 
ACAGTGAGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTG AGTGC CAAATTTTGT TTTTCTTCAT 2220 
TTAAATATTT TCTTTGCCTA AATACATGTG AG AGGAGTTA AATATAAATG TACAGAG AGG 2280 
AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340 
AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 
ATTTATTTTA AAAAAC AATT TTATTGGCCT TTTGCTAACA CAGTAAGC AT GTATTTTATA 2460 
AGGCATTCAA TAAATGCACA ACGCCCA AAG GAAATAAAAT CCTATCTAAT CCTACTCTCC 2520 
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580 
GCACTTATAA AATGATTTGA ACAAATAAAA CTAGG AACCT GTATAC ATGT GTTTCATAAC 2640 
CTGCCTCCTT TGCTTGGCCC TTTATTG AG A .TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTG A CAATACTGAA TAAACATCTC ACCGGAATTC 



SEQ ID NO:8 BCX2 Protein sequence:
Protein Accession #: NP_003005.1

1 11 21 31 41 51
I I I I I I

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60
YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120
ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180
KKVKPTLATY LSKNYSYVIH AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN 240
SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV



SEQ ID NO:9 CBK1 DNA SEQUENCE

Nucleic Acid Accession #: NM_032391

Coding sequence: 129-302 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT 
AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA 
GAACAGCGAT GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA
AGAGTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCGAGACCA 
GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGGAGG CCGAGGCAGG AAGATTCCTT 
GAGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTGTCTC TATTTATACA 
ATAAAATTTT TTTAAAAAAG G



SEQ ID NO:10 CBK1 Protein sequence:
Protein Accession #: NP_115767

1 11 21 31 41 51

I I I I I I

MLCAHFSDQG PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGISG GRGRKIP
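
The relationship between a listed coding sequence range and the corresponding protein sequence can be checked by conceptual translation. The Python sketch below is illustrative only: the function name `translate_cds` is hypothetical, the standard genetic code is assumed, and the coordinates are taken as 1-based and inclusive of the stop codon, as the "Coding sequence: 129-302" annotation suggests. It translates the CBK1 nucleotide sequence (SEQ ID NO:9) over that range for comparison with SEQ ID NO:10.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate nucleotides start..end (1-based, inclusive) of dna.

    Whitespace in dna is ignored. The range is expected to end with a stop
    codon, which is checked and then dropped from the returned protein.
    """
    sequence = "".join(dna.split()).upper()
    cds = sequence[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    residues = [CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3)]
    if residues[-1] != "*":
        raise ValueError("coding range does not end with a stop codon")
    return "".join(residues[:-1])


# SEQ ID NO:9 as listed (spacing is irrelevant to the translation).
SEQ_ID_NO_9 = """
GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT
AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA
GAACAGCGAT GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA
AGAGTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCGAGACCA
GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGGAGG CCGAGGCAGG AAGATTCCTT
GAGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTGTCTC TATTTATACA
ATAAAATTTT TTTAAAAAAG G
"""

cbk1_protein = translate_cds(SEQ_ID_NO_9, 129, 302)
print(cbk1_protein)
```

The same check can be applied to the other nucleotide and protein pairs in this listing, provided the nucleotide sequence is complete and the stated range includes the stop codon.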

SEQ ID NO:11 CHA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_020182

Coding sequence: 96-854 (underlined sequences correspond to start and stop codons)



l 

I 

TCCTTGGGTT 
AACTGAAGGC 
TCATCATCAT 
ACTACAAGCT 
ATGCCCTGTC 
TCCCAGAGCC 
TCGCCCAGCG 
TCGACCTGCC 
CCTGCACCCT 
GCGCACCCCC 
GCCCCTGCCC 
GCATGGAGGG 
TCCAGCACCA 
CACACATCGC 
GACACCCTCT 
ACACTCCGCG 
GTGGCCCTCC 
GCACAAGCTA 
TTTGTTGAGC 
A 




GCAGAGCAGT 
GCCCCTAGAG 
CTAGGGTCCC 
CTTCTTAGAA 
CCTCCCACCT 
AGAGAGCTTG 
TGTGTCTTGA 



21 
I 

CGCCTGGGGG 
CTGCGAAACC 
ATGATGGTGA 
TCCTTCATCA 
TGCCTGTGGC 
GCCCCGCCTC 
CACCGCTTCC 
TCGCTGTCAG 
GACCCCGAGC 
ATCTTCGACA 
AACTCGGGCA 
ACCTACAGCG 
GGGCCGCCCT 
AGCGCAGCCA 
CAGGGGGGCC 
GAGGAGTGAG 
CCCTGTGTAT 
CAAAAAAAAA 
AGGCAAAAGA 



SEQ ID NO:12 CHA1 Protein sequence:
Protein Accession #: NP_064567



31 
I 

TTCGTGGCCA 
AGGCAATGGC 
TGGTGGTGGT 
GCCGGCACAG 
CCTCGGAGAG 
GGCCCACCGA 
AGCCCACCTA 
ACGGGGAGGA 
AGCAGCTGGA 
GTGACCTGAT 
TCAGCGCCAC 
AGGTCATCGG 
CCTTGCTGGA 
TCTGGAGCAA 
GGGCTCGGGC 
AGGAAGGCGG 
AAATATTTAC 
AAGAAAAAAG 
AAAAAAATTT 



41 
I 

TGATCCCCGA 
GGAGCTGGAG 
GATCACGTGC 
CCAGGGGCGG 
CACAGTGTCA 
CCGCCTGGCC 
TCCGTACCTG 
GCCCCCACCC 
ACTGAACCGG 
GGATAGTGCC 
GTGCTACGGC 
CCACTACCCG 
GGGGACCCGG 
AGAGAAGGAT 
TGCGTAGGTG 
GGGGCGCAGC 
ATGTGATGTC 
AAAAAAAAAA 
CTACAGTAAA 



51 
I 

GCTGCTGGAG 
TTTGTTCAGA 
CTGCTGAGCC 
AGGAGAGAAG 
GGCAACGGAA 
GTGCCGCCCT 
CAGCACGAGA 
TACCAGGGCC 
GAGTCGGTGC 
AGGCTGGGCG 
AGCGGCGGGC 
GGGTCCTCCT 
CTCCACCACA 
AAACAGAAAG 
AAAAGGCAGA 
AACGCATCGT 
TGGTCTGAAT 
ACCACGTTTC 
AAAAAAAAAA 



60 
120 
180
240 
300 
360 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



1 11 21 31 41 51

I I I I I I

MAELEFVQII IIVVVMMVMV VVITCLLSHY KLSARSPISR HSQGRRREDA LSSEGCLWPS 60
ESTVSGNGIP EPQVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHEID LPPTISLSDG 120
EEPPPYQGPC TLQLRDPEQQ LELNRESVRA PPNRTIFDSD LMDSARLGGP CPPSSKSGIS 180
ATCYGSGGRM EGPPPTYSEV IGHYPGSSPQ HQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240
SKEKDKQKGH PL



SEQ ID NO:13 CJA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012445

Coding sequence: 276-1271 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I



WO 02/30268 



PCT/US01/32045 



GCACGAGGGA AGAGGGTGAT CCGACCCGGG GAAGGTCGCT GGGCAGGGCG AGTTGGGAAA 60 

GCGGCAGCCC CCGCCGCCCC CGCAGCCCCT TCTCCTCCTT TCTCCCACGT CCTATCTGCC 120 

TCTCGCTGCA GGCCAGGCCG TGCAGCATCG AAGACAGGAG GAACTGGAGC CTCATTGGCC 180 

GGCCCGGGGC GCCGGCCTCG GGCTTAAATA GGAGCTCCGG GCTCTGGCTG GGACCCGACC 240 

GCTGCCGGCC GCGCTCCCGC TGCTCCTGCC GGGTGATGGA AAACCCCAGC CCGGCCGCCG 300

CCCTGGGCAA GGCCCTCTGC GCTCTCCTCC TGGCCACTCT CGGCGCCGCC GGCCAGCCTC 360 

TTGGGGGAGA GTCCATCTGT TCCGCCAGAG CCCCGGCCAA ATACAGCATC ACCTTCACGG 420 

GCAAGTGGAG CCAGACGGCC TTCCCCAAGC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 

GGTCTTCGCT GCTGGGGGCC GCGCATAGCT CCGACTACAG CATGTGGAGG AAGAACCAGT 540 

ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GGCCTGGGCG CTGATGAAGG 600

AGATCGAGGC GGCGGGGGAG GCGCTGCAGA GCGTGCACGC GGTGTTTTCG GCGCCCGCCG 660 

TCCCCAGCGG CACCGGGCAG ACGTCGGCGG AGCTGGAGGT GCAGCGCAGG CACTCGCTGG 720 

TCTCGTTTGT GGTGCGCATC GTGCCCAGCC CCGACTGGTT CGTGGGCGTG GACAGCCTGG 780 

ACCTGTGCGA CGGGGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG 840 

CCGGGACGGA CAGCGGCTTC ACCTTCTCCT CCCCCAACTT CGCCACCATC CCGCAGGACA 900

CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGGC CAACTCCTTC TACTACCCGC 960 

GGCTGAAGGC CCTGCCTCCC ATCGCCAGGG TGACACTGGT GCGGCTGCGA CAGAGCCCCA 1020 

GGGCCTTCAT CCCTCCCGCC CCAGTCCTGC CCAGCAGGGA CAATGAGATT GTAGACAGCG 1080 

CCTCAGTTCC AGAAACGCCG CTGGACTGCG AGGTCTCCCT GTGGTCGTCC TGGGGACTGT 1140 

GCGGAGGCCA CTGTGGGAGG CTCGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200

CCGCCAACAA CGGGAGCCCC TGCCCCGAGC TCGAAGAAGA GGCTGAGTGC GTCCCTGATA 1260 

ACTGCGTCTA AGACCAGAGC CCCGCAGCCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320

GGCTCCTGTG CAGGCTCATG CTGCAGGCGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 

GACCGCGGTG AGGCCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 

GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC CCGTGTCCCG 1500

TCTGCTCTCA GCCTCCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTCC AGCTACTCTA 1560 

AATTATGGTC TCCTTATAAG TTATTGCTGC TCCAGGAGAT TGTCCTTCAT CGTCCAGGGG 1620 

CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 

CTCTCCCGAG GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATG GGGCGGTTTC 1740 

GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800
TGCTCAC 



SEQ ID NO:14 CJA5 Protein sequence:
Protein Accession #: NP_036577

1 11 21 31 41 51

I I I I I I

MENPSPAAAL GKALCALLLA TLGAAGQPLG GESICSARAP AKYSITFTGK WSQTAFPKQY 60
PLFRPPAQWS SLLGAAHSSD YSMWRKNQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120
HAVFSAPAVP SGTGQTSAEL EVQRRHSLVS FVVRIVPSPD WFVGVDSLDL CDGDRWREQA 180
ALDLYPYDAG TDSGFTFSSP NFATIPQDTV TEITSSSPSH PANSFYYPRL KALPPIARVT 240
LVRLRQSPRA FIPPAPVLPS RDNEIVDSAS VPETPLDCEV SLWSSWGLCG GHCGRLGTKS 300
RTRYVRVQPA NNGSPCPELE EEAECVPDNC V

SEQ ID NO:15 LBH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_002391
Coding sequence: 26-457 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 
CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 
CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 
CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 
GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 
TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360
CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 420 
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 
GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 
CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 
ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 
TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 

SEQ ID NO:16 LBH9 Protein sequence:
Protein Accession #: NP_002382

1 11 21 31 41 51

I I I I I I

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60
CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120
RVTKPCTPKT KAKAKAKKGK GKD






SEQ ID NO:17 LEM9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005244

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGGTAGAAC TAGTGATCTC ACCCAGCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGCTGACGC TGCTGTGTGG ACTCTGAGTG ACAGACAAGG CATCACCAAA 120 

TCGGCCCCCC TGAGAGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAGCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCCCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA GCACCTCACC CACTGGACAG AGCCCATACA CCTACCAGAT GCACGGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACGCAG CCGGTTTCGG GAGTGTGCAC 480 

CAGGACTATC CTTCCTACCC CGGCTTCCCC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCCCTACGT CCCGGCCAGC AGCATCTGCC CTTCGCCCCT CTCCACGTCC 600 

ACCTACGTCC TCCAGGAGGC ATCTCACAAC GTCCCCAACC AGAGTTCCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAGCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACCGGG CCTCCGACGG GAAGCTCCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCGTGTGGG ACTTGGATGA GACAATAATT 840 

ATTTTTCACT CCTTACTCAC GGGGACATTT GCATCCAGAT ACGGGAAGGA CACCACGACG 900 

TCCGTGCGCA TTGGCCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAATG ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATGGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTGGG CTCTGGCGTG CACGGCGGCG TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAGATG TACAATACCT ACAAGAACAA CGTTGGTGGG 1200 

TTGATAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCTGACCCA CTCCCTGAAG GCACTAAACC TCATCAACTC CCGGCCCAAC 1320 

TGTGTCAATG TGCTGGTCAC CACCACTCAA CTAATTCCTG CCCTGGCCAA AGTCCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TCCTGCCACG CAGACCTGGA GGCACTGAGG CACGCCCTGG AACTGGAGTA TTTATAG 

SEQ ID NO:18 LEM9 Protein sequence:
Protein Accession #: NP_005235

1 11 21 31 41 51

I I I I I I

MVELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPRVLPR 60
QPSTAMAAYG QTQYSAGIQQ ATPYTAYPPP AQAYGIPSYS IKTEDSLNHS PGQSGFLSYG 120
SSFSTSPTGQ SPYTYQMHGT TGFYQGGNGL GNAAGFGSVH QDYPSYPGFP QSQYPQYYGS 180
SYNPPYVPAS SICPSPLSTS TYVLQEASHN VPNQSSESLA GEYNTHNGPS TPAKEGDTDR 240
PHRASDGKLR GRSKRSSDPS PAGDNEIERV FVWDLDETII IFHSLLTGTF ASRYGKDTTT 300
SVRIGLMMEE MIFNLADTHL FFNDLEDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360
GANLCLGSGV HGGVDWMRKL AFRYRRVKEM YNTYKNNVGG LIGTPKRETW LQLRAELEAL 420
TDLWLTHSLK ALNLINSRPN CVNVLVTTTQ LIPALAKVLL YGLGSVFPIE NIYSATKTGK 480
ESCFERIMQR FGRKAVYVVI GDGVEEEQGA KKHNMPFWRI SCHADLEALR HALELEYL



SEQ ID NO:19 OAA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_002740

Coding sequence: 178-1968 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

CCGCGGTTCC GGCTGCTCCG GCGAGGCGAC CCTTGGGTCG GCGCTGCGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGCGGAG TCCCCCACGG 120 

CGCCCGAAGC GCCCCCCGCA CCCCCGGCCT CCAGCGTTGA GGCGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACAGCAG CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240

CATTCCCACC AGGTCCGGGT GAAAGCCTAC TACCGCGGGG ATATCATGAT AACACATTTT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAAGA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG TGTTCCCTTG TGTACCAGAA CGTCCTGGGA TGCCTTGTCC AGGAGAAGAT 540 

AAATCCATCT ACCGTAGAGG TGCACGCCGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG CCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TGCCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TGCCACAGGA ACCAGTGATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATG AGAGTTTGGA TCAAGTTGGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCTT CAGGATTTTG ATTTGCTCCG GGTAATAGGA 960 

AGAGGAAGTT ATGCCAAAGT ACTGTTGGTT CGATTAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA GCTTGTTAAT GATGATGAGG ATATTGATTG GGTACAGACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATCCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT GTTCTTTGTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTCCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 



ATCAGTCTAG CATTAAATTA TCTTCATGAG CGAGGGATAA TTTATAGAGA TTTGAAACTG 1320
GACAATGTAT TACTGGACTC TGAAGGCCAC ATTAAACTCA CTGACTACGG CATGTGTAAG 1380
GAAGGATTAC GGCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTGCT 1440
CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG ACTGGTGGGC TCTTGGAGTG 1500
CTCATGTTTG AGATGATGGC AGGAAGGTCT CCATTTGATA TTGTTGGGAG CTCCGATAAC 1560
CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620
CCACGTTCTC TGTCTGTAAA AGCTGCAAGT GTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680
AAGGAACGAT TGGGTTGTCA TCCTCAAACA GGATTTGCTG ATATTCAGGG ACACCCGTTC 1740
TTCCGAAATG TTGATTGGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800
AATATTTCTG GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAACCTGTC 1860
CAGCTCACTC CAGATGACGA TGACATTGTG AGGAAGATTG ATCAGTCTGA ATTTGAAGGT 1920
TTTGAGTATA TCAATCCTCT TTTGATGTCT GCAGAAGAAT GTGTCTGATC CTCATTTTTC 1980
AACCATGTAT TCTACTCATG TTGCCATTTA ATGCATGGAT AAACTTGCTG CAAGCCTGGA 2040
TACAATTAAC CATTTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTGTAG 2100
ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160
TCCAGACAAT CATGTCAAAA TTTAGTTGAA CTGGTTTTTC AGTTTTTAAA AGGCCTACAG 2220
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA G



SEQ ID NO:20 OAA1 Protein sequence:
Protein Accession #: NP_002731

1 11 21 31 41 51

I I I I I I

MSHTVAGGGS GDHSHQVRVK AYYRGDIMIT HFEPSISFEG LCNEVREMCS FDNEQLFTMK 60
WIDEEGDPCT VSSQLELEEA FRLYELNKDS ELLIHVFPCV PERPGMPCPG EDKSIYRRGA 120
RRWRKLYCAN GHTFQAKRFN RRAHCAICTD RIWGLGRQGY KCINCKLLVH KKCHKLVTIE 180
CGRHSLPQEP VMPMDQSSMH SDHAQTVIPY NPSSHESLDQ VGEEKEAMNT RESGKASSSL 240
GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI YAMKVVKKEL VNDDEDIDWV QTEKHVFEQA 300
SNHPFLVGLH SCFQTESRLF FVIEYVNGGD LMFHMQRQRK LPEEHARFYS AEISLALNYL 360
HERGIIYRDL KLDNVLLDSE GHIKLTDYGM CKEGLRPGDT TSTFCGTPNY IAPEILRGED 420
YGFSVDWWAL GVLMFEMMAG RSPFDIVGSS DNPDQNTEDY LFQVILEKQI RIPRSLSVKA 480
ASVLKSFLNK DPKERLGCHP QTGFADIQGH PFFRNVDWDM MEQKQVVPPF KPNISGEFGL 540
DNFDSQFTNE PVQLTPDDDD IVRKIDQSEF EGFEYINPLL MSAEECV



1 
I 

CCAGGCGGCG 
GCCGCCGCCG 
TGCCCGCCGC 

CTGGGACTGG 
CACGGTCCTC 
CTATCTCTCC 
TGCCTTGGGA 
AAGAAGTCGG 
CACCACGCTG 
AGGGATCATG 
CAAAATTATG 
CTACGTCTAC 
ACCCCTGTTC 
CCTGTCGAGG 
CCTGGAGGGC 
TGTTTTGGTA 
TGTGTACTCC 
GGAGGTGGAG 
GGTGTTATAC 
CGACCTGATG 
CACGAAGGCC 
CCTGCAGACC 
CAAGACCGCT 
AAAATCCTCC 
GGACTTGGCC 
CTACCTCCTG 
CATGGTGCCC 
GAAGAGCAAA 
AAAGCTTTAT 
GCTGAAGGTG 
CACGCCCTTT 
CATCCTGGAT 
CCTGAACATT 
CCTGAGGATC 
CAAAGACGGC 
GAGCGACCCT 



11 

I 

TTGCGGCCCC 
CCGCCGCCAG 

ACCGGCATGG 
AATGTCACGT 
GTGTGGGTGC 
CGACATGACC 
TTTTTGCTGT 
GGCATATTCC 
CTTGCTACCT 
CTCACTTTCT 
ACAGCCTTAA 
TTTTCCCTCT 
TCGGAAACCA 
ATCACCTTCT 
AGTGACCTCT 
AAGAACTGGA 
TCCAAGGATC 
GCTTTGATCG 
AAGACCTTTG 
ATGTTTTCCG 
CCAGACTGGC 
CTCGTGCTGC 
GTCATTGGGG 
ACGGTCGGGG 
ACGTACATTA 
TGGCTGAATC 
GTCAATGCTG 
GACAATCGGA 
GCCTGGGAGC 
CTGAAGAAGT 
CTGGTGGCCT 
GCCCAGACAG 
CTCCCCATGG 
TTTCTCTCCC 
GGGGGCACGA 
CCCACACTGA 



21 
I 

GGCCCCGGCT 
CGCTAGCGCC 
AGCAACCGGG 
CGCTCCGGGG 
GGAATACCAG 
CTTGTTTTTA 
GAGGCTACAT 
GGATCGTCTG 
TGGCCCCAGT 
TTTTAATTCA 
GGCTGGTAGC 
AAGAGGATGC 
TACTCATTCA 
TCCACGACCC 
GGTGGATCAC 
GGTCCTTAAA 
AGAAGGAATG 
CTGCCCAGCC 
TCAAGTCCCC 
GGCCCTACTT 
GGCCGCAGAT 
AGGGCTACTT 
ACCAGTACTT 
CTGTCTATCG 
AGATTGTCAA 
ACATGATCTG 
TGGGCCCTTC 
TGATGGCGAT 
TCAAGCTGAT 
TGGCATTCAA 
CTGCCTACCT 
TGTGCACATT 
CCTTCGTGTC 
TCATCAGCAG 
ATGAGGAGCT 
ACAGCATCAC 
ATGGCATCAC 



31 
I 

CCCTGCGCCG 
AGCAGCCGGG 
CCCGATCACC 
CTTCTGCAGC 
CAACCCCGAC 
CCTCTGGGCC 
TCAGATGACA 
CTGGGCAGAC 
GTTTCTGGTC 
GCTGGAGAGG 
CCTAGTGTGT 
CCAGGTGGAC 
GCTCGTCTTG 
TAATCCCTGC 
AGGGTTGATT 
CAAGGAGGAC 
CGCCAAGACT 
GAAAGAGAGT 
ACAGAAGGAG 
CCTCATGAGC 
CTTAAAGTTG 
CTACACCGTG 
CCACATCTGC 
GAAGGCCCTG 
CCTCATGTCT 
GTCAGCCCCC 
CGTCCTGGCT 
GAAGACCAAG 
GAACGAAATT 
GGACAAGGTG 
GTCAGCCGTG 
TGCCGTCTAC 
TTTGGCCTTG 
CATCGTGCAG 
GGAACCTGAC 
CGTGAGGAAT 
CTTCTCCATC 



41 



CCCGATCACC 
CGCCGCCCGG 
GCCGATGGCT 
TTCACCAAGT 

CCTCTCAACA 
CTCTTCTACT 
AGCCCAACTC 
AGGAAGGGAG 
GCCCTAGCCA 
CTGTTTCGTG 
TCCTGTTTCT 
CCAGAGTCCA 
GTCCGGGGCT 
ACGTCGGAAC 
AGGAAGCAGC 
TCCAAGGTGG 
TGGAACCCCT 
TTCTTCTTCA 
CTCATCAAGT 
CTGCTGTTTG 
TTCGTCAGTG 
GTGATCACCA 
GTGGACGCTC 
CTGCAAGTCA 
GGAGTGGCGG 
ACGTATCAGG 
CTCAATGGGA 
CTGGCCATCA 
GGCACCTTCA 
GTGACCATTG 
TTCAACATCC 
GCGAGTGTCT 
AGCATCGAGC 
GCCACATTCA 
CCCGAAGGTG 



51 
I 

CGCCGCCGCC 
CGCCGCCCGG 
TGCCCGCCGC 
CCGACCCGCT 
GCTTTCAGAA 
TCTACTTCCT 
AAACCAAAAC 
CTTTCTGGGA 
TCTTGGGCAT 
TTCAGTCTTC 
TCCTGAGATC 
ACATCACTTT 
CAGATCGCTC 
GCGCTTCCTT 
ACCGCCAGCC 
AAGTCGTGCC 
CGGTGAAGGT 
ATGCGAATGA 
CTCTGTTTAA 
AGGCCATCCA 
TCGTGAATGA 
TCACTGCCTG 
GCATGAGGAT 
ATTCAGCCAG 
AGAGGTTCAT 
TCCTTGCTCT 
TGATGGTCCT 
TGGCCCACAT 
TCAAAGTGCT 
GGCAGGAGGA 
CCTGGGTCTG 
ACGAGAACAA 
TCCGGTTTCC 
CCCTCAAACG 
GACGGCCTGT 
CCTGGGCCAG 
CTTTGGTGGC 



1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 



60 
120 
180 
240 
300 
360 
420 
480 
540 



SEQ ID NO:21 OBH2 DNA SEQUENCE

Nucleic Acid Accession #: L05628

Coding sequence: 197-4792 (underlined sequences correspond to start and stop codons) 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 



CGTGGTGGGC CAGGTGGGCT GCGGAAAGTC GTCCCTGCTC TCAGCCCTCT TGGCTGAGAT 2280

GCACAAAGTG GAGGGGCACG TGGCTATCAA GGGCTCCGTG GCCTATGTGC CACAGCAGGC 2340 

CTCGATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT GGATGTCAGC TGGAGGAACC 2400 

ATATTACAGG TCCGTGATAC AGGCCTGTGC CCTCCTCCCA GACCTGGAAA TCCTGCCCAG 2460 

TGGGGATCGG ACAGAGATTG GCGAGAAGGG CGTGAACCTG TCTGGGGGCC AGAAGCAGCG 2520 

CGTGAGCCTG GCCCGGGCCG TGTACTCCAA CGCTGACATT TACCTCTTCG ATGATCCCCT 2580 

CTCAGCAGTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTG GCCCCAAGGG 2640 

GATGCTGAAG AACAAGACGC GGATCTTGGT CACGCACAGC ATGAGCTACT TGCCGCAGGT 2700 

GGACGTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAG ATGGGCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCGTACC TATGCCAGCA CAGAGCAGGA 2820 

GCAGGATGCA GAGGAGAACG GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGG TGACGGACAG TGCAGGGAAG CAACTGCAGA GACAGCTCAG 2940 

CAGCTCCTCC TCCTATAGTG GGGACATCAG CAGGCACCAC AACAGCACCG CAGAACTGCA 3000 

GAAAGCTGAG GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAGAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATCGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTTCCTTT TCATGTGTAA CCATGTGTCC GCGCTGGCTT CCAACTATTG 3180 

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAAAGTCCG 3240 

GCTGAGCGTC TATGGAGCCC TGGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

GGCCGTGTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTG CACGTGGACC TGCTGCACAG 3360 

CATCCTGCGG TCACCCATGA GCTTCTTTGA GCGGACCCCC AGTGGGAACC TGGTGAACCG 3420 

CTTCTCCAAG GAGCTGGACA CAGTGGACTC CATGATCCCG GAGGTCATCA AGATGTTCAT 3480 

GGGCTCCCTG TTCAACGTCA TTGGTGCCTG CATCGTTATC CTGCTGGCCA CGCCCATCGC 3540 

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCTCCCGG CAGCTGAAGC GCCTCGAGTC GGTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC TTGCTGGGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AGCGCTTCAT 3720 

CCACCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCCCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC TGTTTGCTGC 3840 

CCTGTTTGCG GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGCC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA CGTACTTGAA CTGGCTGGTT CGGATGTCAT CTGAAATGGA 3960 

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTATTCA GAGACTGAGA AGGAGGCGCC 4020

CTGGCAAATC CAGGAGACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTGACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGG GTTCCCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380 

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGCCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGGGGAG AACCTCAGTG TCGGGCAGCG 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAGATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCCGTGGACC TGGAAACGGA CGACCTCATC CAGTCCACCA TCCGGACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCGGCTC AACACCATCA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGCATGGC CAAAGACGCC GGCTTGGTGT GAGCCCCAGA 4800

GCTGGCATAT CTGGTCAGAA CTGCAGGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

CCCCTGGTAA ACCAAGCCTC CCACACTGAA ACCAAAACAT AAAAACCAAA CCCAGACAAC 4920 

CAAAACATAT TCAAAGCAGC AGCCACCGCC ATCCGGTCCC CTGCCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGCGAACCAC C 



1 11 21 31 41 51 

I I I I I I

MALRGFCSAD GSDPLWDWNV TWNTSNPDFT KCFQNTVLVW VPCFYLWACF PFYFLYLSRH 60 

DRGYIQMTPL NKTKTALGFL LWIVCWADLF YSFWERSRGI FLAPVFLVSP TLLGITTLLA 120 

TFLIQLERRK GVQSSGIMLT FWLVALVCAL AILRSRIMTA LKEDAQVDLP RDITFYVYFS 180 

LLLIQLVLSC FSDRSPLFSE TIHDPNPCPB SSASFLSRIT FWWITGLIVR GYRQPLEGSD 240 

LWSLNKEDTS EQWPVLVKN WKKECAKTRX QPVKWYSSK DPAQPKESSK VDANEEVEAL 300 

rVKSPQKEWN PSLFKVLYKT FGPYPLMSFF FKAIHDLMMF SGPQILKLLI KFVNDTKAPD 360 

WQGYFYTVLL FVTACLQTLV LHQYFHICFV SGMRIKTAVI GAVYRKALVI TNSARKSSTV 420 

GEIVNLHSVD AQRFMDLATY INMIWSAPLQ VILALYLLML NLGPSVLAGV AVHV/LKVPVN 480 

AVMAMKTKTY QVAHMKSKDN RIKLHMEILN GIKVLKLYAW ELAFKDKVLA IRQEELKVLK 540 

KSAYLSAVGT FTWVCTPFLV ALCTFAVYVT IDENNILDAQ TAFVSLALFN ILRFPLNILP 600 

MVISSIVQAS VSLKRLRIFL SHEELEPDSI ERRPVKDGGG TNSITVRNAT FTWARSDPPT 660 

LNGITFSIPE GALVAVVGQV GCGKSSLLSA LLAEMDKVEG HVAIKGSVAY VPQQAWIQND 720

SLRENILFGC QLEEPYYRSV IQACALLPDL EILPSGDRTE IGEKGVNLSG GQKQRVSLAR 780

AVYSNADIYL FDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHSMS YLPQVDVIIV 840 

MSGGKISEMG SYQELLARDG AFAEFLRTYA STEQEQDAEE NGVTGVSGPG KEAKQMENGM 900

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEETWKLMEA DKAQTGQVKL 960

SVYWDYMKAI GLFISFLSIF LFMCNHVSAL ASNYWLSLWT DDPIVNGTQE HTKVRLSVYG 1020

ALGISQGIAV FGYSMAVSIG GILASRCLHV DLLHSILRSP MSFFERTPSG NLVNRFSKEL 1080

DTVDSMIPEV IKMFMGSLFN VIGACIVILL ATPIAAIIIP PLGLIYFFVQ RFYVASSRQL 1140 

KRLESVSRSP VYSHFNETLL GVSVIRAFEE QERFIHQSDL KVDENQKAYY PSIVANRWLA 1200

VRLECVGNCI VLFAALFAVI SRHSLSAGLV GLSVSYSLQV TTYLNWLVRM SSEMETNIVA 1260 

VERLKEYSET EKEAPWQIQE TAPPSSWPQV GRVEFRNYCL RYREDLDFVL RHINVTINGG 1320 

EKVGIVGRTG AGKSSLTLGL FRINESAEGE IIIDGINIAK IGLHDLRFKI TIIPQDPVLF 1380

SGSLRMNLDP FSQYSDEEVW TSLELAHLKD FVSALPDKLD HECAEGGENL SVGQRQLVCL 1440

ARALLRKTKI LVLDEATAAV DLETDDLIQS TIRTQFEDCT VLTIAHRLNT IMDYTRVIVL 1500
DKGEIQEYGA PSDLLQQRGL FYSMAKDAGL V






SEQ ID NO:23 PAA2 DNA SEQUENCE

Nucleic Acid Accession #: NM_013309

Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons) 



ATGGCCGGCT 
CTGTTTTTAA 
TCTCGGTTCA 
CCTGTTAACG 
TTACCTTTGA 
CAGAGAGAGA 
TACTTGCTTT 
ATGACAGATG 
TTGTGGCTAT 
GTTTTGTCAG 
GAAGCTGTGC 
ACCGCAGCTG 
CACCGTCACT 
GAACGTAACC 
GATTTGGTAC 
TACAAGATTG 
TTTCGAATCA 
GTAGACTATA 
AATATCTGGT 
GGAAGTTCAT 
TTTGGCATGT 
TGTGCAAATT 



11 
1 

CTGGCGCGTG 
ATGACACCAG 
ACAAACTTCG 
GGGCGCACCC 
CCAACAGTCA 
TACTGAAGCA 
TCATGATTGG 
CACTTCATAT 
CATCAAAATC 
CTATGATTAG 
AAAGAACTAT 
TTGGAGTTGC 
CCCATTCCCA 
ATGGGCAGGA 
AGAGTGTXGG 
CTGATCCCAT 
TATGGGATAC 
TCAAAGAAGC 
CTCTCACTTC 
CTAAATGGGA 
ATAGATGTAC 
GTCAGAGTTC 




CCATATGAAC 
AGTTAATGTA 
CTCCCTGCCT 
TAGCCTGGCA 
TGTGCTAATA 
CTGTACATAC 
AGTAGTTATA 
CTTGATGAAA 
AGGAAAATCT 
GGAAGTACAG 
TATTCAGCTT 
TAGTCCCTGA 



31 
I 

AAATCTATGC 
TTCTCGGATG 
GCCGATGACG 
GCCGACGATG 
AAGGTGGACT 
AAAGCCAGGT 
GGTGGATACA 
CTAAGCGCCA 
AGATTCACCT 
GTGTATATAC 
TATGAAATAA 
ATAATGGGGT 
TCAAATTCCC 
GTGAGAGCTG 
GCTGCATACA 
GTATTTTCAT 
ATACTAGAAG 
ATAGAAGATG 
ACTGCCATAG 
TCCAAAGCAA 
CAGAGTTACA 



41 
I 

TAAGGAAGGA 
AGGCGGGGGA 
GTTCCGAAGC 
ATTCCTTACT 
CCTGTGACAA 
TGACCATTGC 
TTGCAAATAG 
TCATACTCAC 
TTGGATTTCA 
TTATGGGATT 
ATGGAGATAT 
TTCTGTTGAA 
CTACCAGAGG 
CATTTGTACA 
TCATACGATT 
TACTTGTGGC 
GTGTGCCAAG 
TATATTCAGT 
TTCACATACA 
ACCATTTATT 
GGCAAGAAGT 



51 
I 

TGATGCGCCG 
CGAGGGGCTT 
CCCGGAAAGG 
GGACCAAGAC 
CTGCAGCAAA 
TGCCGTTCTG 
CCTAGCAATC 
CCTGCTTGCT 
TCGCTTAGAG 
CCTCTTATAT 
AATGCTCATC 
CCAGTCTGGT 
TTCTGGGTGT 
TGCTTTGGGA 
CAAGCCAGAA 
TTTTACAACA 
CCATTTCAAT 
CGAAGATTTA 
GCTAATTCCT 
ATTGAACACA 
GGACAGAACT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



SEQ ID NO:24 PAA2 Protein sequence
Protein Accession #: NP_037441 



MAGSGAWKRL 
PVNGAHPTLQ 
YLLFMIGELV 
VLSAMISVLL 
HRHSHSHSLP 
YKIADPICTY 
NIWSLTSGKS 
CANCQSSSP 



11 
1 

KSMLRKDDAP 
ADDDSLLDQD 
GGYIANSLAI
VYILMGFLLY 
SNSPTRGSGC 
VFSLLVAFTT 
TAIVHIQLIP 



21 
I 

LFLNDTSAFD 
LPLTNSQLSL 
MTDALHMLTD 
EAVQRTIHMN 
ERNHGQDSLA 
FRIIWDTWI 
GSSSKWEEVQ 



31 
I 

FSDEAGDEGL 
KVDSCDNCSK 
LSAIILTLLA 
YEINGDIMLI 
VRAAFVHALG 
ILEGVPSHLN 
SKANHLLLNT 



41 
I 

.SRFNKLRVW 
QREILKQRKV
LWLSSKSPTK 
TAAVGVAVNV 
DLVQSVGVLI 
VDYIKEALMX 
FGMYRCTIQL 



51 
I 

ADDGSEAPER
KARLTIAAVL 
RFTFGFHRLE 
IMGFLLNQSG 
AAYIIRFKPE 
IEDVYSVEDL 
QSYRQEVDRT 



1 
I 

GCCGAGTCGG 
AAGTGGTTCC 
CGCGTGGCGG 
AGGTCCCGGG 
ATTTGAAAGT 
AGTGTTGTCT 
AACTGCAGCT 
TAATGTGCAT 
ATTTTAGTAC 
ACTATGGAAT 
GAAAAGAAAA 
TCCCTACTGA 
TTTTTAGTGA 
TGAAAGGAAA 
GAGCAGTCAT 
AAATTGCCCT 
TTCATTGTAA 
CATTGACTAC 
AAGTTGCTGA 
TTTTTATTGT 
CTTGGCGTCT 
ACATTCCTCA 
TTTTGGTATT 
AGGAAATACA 
ATGAAGTGGC 



11 
1 

TGGCGGCTGC 
AGGCTACCCG 
CGGGAACTGT 
CAGATAACAT 
AGCAAAATAG 
TAGGAAACAG 
GATAATGTTT 
TTTTTACATG 
ATTGCAACCA 
TTCAGTTGCC 
GGATTTGATG 
CACCTTGTTT 
AGTGAAATAT 
AGCAAATATT 
GGAAGCCGGT 
TTTGGAAAGT 
ACTAGTCTTG 
ACTGAACATT 
AGATCCTCAA 
TAGCCAACAG 
TCTGGGAAAA 
AGATGCTAAT 
ACATGATGTT 
AGAAGATGAA 
AGAAACTGTT 



21 
I 

AGGCTGGGAG 
GCTAGTCTGG 
TGGCCGCGCG 
AGATCATCAG 
AAAATAAAGA 
AACACAGCAG 
TCCGGCTTCA 
CCAACAGTAA 
GGTCTTGAAG 
AAGGTTAATT 
AAAGCATATT 
GATGTGAATG 
ATTACCAACC 
ATATTCTCAT 
TTTGTGTATG 
ATTGGCTCTG 
GACTTGACCC 
CACCTGTTTA 
CAAGTTTCAA 
GCTACTTATG 
GCAGGAGTTC 
GTGGTCTTCA 
GATTTAATAA 
GACAATGACA 
TTCAGAGATA 



31 
I 

GGAGAAGTGC 
CACGGCCCCG 
GCCTCGGGAA 
TAGAAAACTT 
ATTAACAGCA 
TGAAAAAACA 
ATGTCTTTAG 
ACTCTTTACC 
AACTGAATGA 
GTGTCAAAGA 
TATTCAAGGG 
CCATTGTCGC 
TGGAAGACCT 
ATGTAAGAGC 
GGACTACATA 
AGGATGTGGA 
AGCAATGTAG 
TTAAGACAAT 
CTGTCCATCT 
AAGCTGATAG 
TACTCTTCTT 
AAAGAGCAGA 
TATCTCATGT 
TGGAAGGTCC 
GGAAGAGAAA 



41 
I 

TACGCCTTTG 
TCTTCTGCCT 
CGGCCCAGGT 
CTTGAAGTTG 
GATACAGAGG 
GACAAAATCC 
AGTTGGGATC 
AGAACTGAGT 
GGCTGTTAGA 
AGAAATATCA 
CAACATATTG 
CCATGTTCTC 
TCAGAACATA 
CATTGGAATA 
CCAATTTGTC 
ATATGCACAT 
AAGAACACTA 
GAAAGCACCT 
CCAACTGGGC 
AAGAACTGCA 
AAGGGACTCT 
AGAGGGAGTT 
GGAAAATAAT 
AGATATAGAT 
ATTACCTTTG 



51 
I 

CAGGTTGGCG 
CCTCCTCCGT 
CCCCGCCCGC 
TTCAAGAAAA 
ACAGCATGGA 
GCTCAGATAC 
TCTTTTGTCA 
CCTCAGAAAT 
CCTCTGCAGG 
AGATACTGTG 
CTCAGAGAAT 
TTTGCTCTTC 
GAAAATGCTC 
CCAGAGCACA 
TTAACCACAG 
CTCTACTTTT 
ATGGAACAGC 
CTGTTGACTG 
TTACCACTGG 
GAATGGGTTG 
TTGGAAGTGA 
CCAGTGGAAT 
ATGCACATTG 
GTTCAGGATG 
GAACTTACAG 



60 
120 
180 
240 
300 
360 
420 



SEQ ID NO:25 PAA3 DNA SEQUENCE

Nucleic Acid Accession #: AB037765

Coding sequence: 375-2798 (underlined sequences correspond to start and stop codons) 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 






WO 02/30268 



TGGAACTAAC AGAAGAAACA TTTAATGCAA CAGTGATGGC TTCTGACAGC ATAGTACTCT 1560 

TCTATGCTGG TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAG 1620 

TTAAACTGAA AGGCACATCT ACTATGCTTC TTACTAGAAT AAACTGTGCA GATTGGTCTG 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740

AGAACCCAGT ATCTTATGCT GGAATGTTAG GAACCAAAGA TCTCCTAAAA TTTATCCAGC 1800 

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA GAATATTTAA 1860

GTGGGGAATT ATATAAAGAC CTCATCTTGT ATTCTAGTGT GTCAGTATTG GGACTATTTA 1920 

GTCCAACCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AGGAAACTAC CTAAAAGGAT 1980 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCTTCC AGCCCTGCTG CTTGCCAGAC ACACAGAAGG CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA CCATTATTGA 2220 

TTTTGTTCAG TGATGGCACT GTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280 

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC CCTTCCTCTT CTTGTTTTGG 2400 

TGAATCTGCA TTCAGGTGGC CAAGTATTTG CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATTAG AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA CCTCCTCTTC CAGCTTATGA TTTTCTAAGT ATGATAGATG 2580 

CCGCAACATC TCAACGTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGATA AATCGGCAGT CAGAAAAGAA CCGATTGAAA 2700 

CTCTGAGAAT AAAGCATTGG AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AAATCATTTA 2760 

GACGTGATAA AGAGTTAGGA TGCTCAAAAG TGAACTAATT TTATAGGGCT GTGGTTTCCA 2820 

AAATTTTTTT GGCATGATAG ACTTAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGATATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CGAAATATAG AAATTATTAA TGAGATATTT 3060 

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTATACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTG TGACTACTGT 3180 

ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCTA GCCATGATTT 3240 

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTCCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGCGTTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CTAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480 

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CTGGTTCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGATTATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720 

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAG GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTGTCT CATTTTTGCT 4020 

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTT GGTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTGTTCTT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140 

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAG CTAAAACTCA GTAAGTGCAT 4200 

MTATTTCCTT ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTCCTCA ATTTTTTTTC 4260 

CTGTAGCATG TGATGCCTGA TTAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATGAGAGT GAACTCTAAA TATAGGTTGT AGTAATAAAA CATCATTAGC CTAATTATTA 4380 

GAAAATGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTGTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAATACCTT CAAAAAAAAA AAAAAA 

SEQ ID NO:26 PAA3 Protein sequence
Protein Accession #: BAA92562

1 11 21 31 41 51 

I I I I I I 

MFSGFNVFRV GISFVIMCIF YMPTVNSLPE LSPQKYFSTL QPGLEELNEA VRPLQDYGIS 60

VAKVNCVKEE ISRYCGKEKD LMKAYLFKGN ILLREFPTDT LFDVNAIVAH VLFALLFSEV 120

KYITNLEDLQ NIENALKGKA NIIPSYVRAI GIPEHRAVME AGFVYGTTYQ FVLTTEIALL 180

ESIGSEDVEY AHLYFFHCKL VLDLTQQCRR TLMEQPLTTL NIHLFIKTMK APLLTEVAED 240

PQQVSTVHLQ LGLPLVFIVS QQATYEADRR TAEWVAWRLL GKAGVLLLLR DSLEVNIPQD 300

ANVVFKRAEE GVPVEFLVLH DVDLIISHVE NNMHIEEIQE DEDNDHEGPD IDVQDDEVAE 360

TVFRDRKRKL PLELTVELTE ETFNATVMAS DSIVLFYAGW QAVSMAFLQS YIDVAVKLKG 420

TSTMLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGENPVS YAGMLGTKDL LKFIQLNRIS 480

YPVNITSIQE AEEYLSGELY KDLILYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGYVITG 540 

IYSEEDVLLL STKYAASLPA LLLARHTEGK IESIPLASTH AQDIVQIITD ALLEMFPEIT 600

VENLPSYFRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660 

RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAIIEENLVL WLKKLEAGLE NHITILPAQE 720

WKPPLPAYDF LSMIDAATSQ RGTRKVPKCM KETDVQENDK EQHEDKSAVR KEPIETLRIK 780 
HWNRSNWFKE AEKSFRRDKE LGCSKVN

SEQ ID NO:27 PAA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012449

Coding sequence: 66-1085 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAG 60 

AATTAATGGA AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120

GGAGAAATTT AGAAGAAGAC GATTATTTGC ATAAGGACAC GGGAGAGACC AGCATGCTAA 180

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TGATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAGTGGCA CTTGCCAATT AAAATAGCTG 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTTCTGAG GGAAGTAATT CACCCTTTAG 360 

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420 

CAATGGTTTC CATCACTCTC TTGGCATTGG TTTACCTGCC AGGTGTGATA GCAGCAATTG 480 

TCCAACTTCA TAATGGAACC AAGTATAAGA AGTTTCCACA TTGGTTGGAT AAGTGGATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCAGTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

GTCTGTCTTA CCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATGCCTGGA TTGAGCATGA TGTTTGGAGA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG GGATTGGCAA TACTGGCTCT GTTGGCTGTG ACATCTATTC 780 

CATCTGTGAG TGACTCTTTG ACATGGAGAG AATTTCACTA TATTCAGAGC AAGCTAGGAA 840 

TTGTTTCCCT TCTACTGGGC ACAATACACG CATTGATTTT TGCCTGGAAT AAGTGGATAG 900 

ATATAAAACA ATTTGTATGG TATACACCTC CAACTTTTAT GATAGCTGTT TTCCTTCCAA 960 

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAGATACTGA 1020 

AGATTAGACA TGGTTGGGAA GACGTCACCA AAATTAACAA AACTGAGATA TGTTCCCAGT 1080 

TGTAGAATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACATT 1140 
TCAAGTTTGT ATTTGTTAAT AAAATGATTA TTCAAGGAAA AAAAAAAAAA AAAAA 
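
The coordinates in the "Coding sequence" lines are 1-based and inclusive: in the PAA5 sequence above, positions 66-68 read ATG and positions 1083-1085 read TAG. As an illustrative sketch only (not part of the original listing), the Python below shows one way such a range could be cut out of rows in this layout and checked for a start codon, an in-frame length, and a stop codon. The function names `extract_cds` and `check_cds` and the toy row are assumptions introduced here for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_cds(sequence: str, start: int, end: int) -> str:
    """Return the coding region for 1-based, inclusive coordinates.

    Spaces between 10-residue blocks and the trailing position counters
    (digits) of the listing layout are dropped before slicing.
    """
    cleaned = "".join(ch for ch in sequence.upper() if ch.isalpha())
    if not 1 <= start <= end <= len(cleaned):
        raise ValueError("coordinates fall outside the sequence")
    return cleaned[start - 1:end]


def check_cds(cds: str) -> dict:
    """Report basic consistency checks for a candidate coding region."""
    return {
        "length_is_multiple_of_3": len(cds) % 3 == 0,
        "starts_with_atg": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
    }


# Hypothetical toy row in the listing layout (not one of the sequences above).
toy_row = "CCGAGATGGA AAGCTGATTT 20"
cds = extract_cds(toy_row, 6, 17)
print(cds)
print(check_cds(cds))
```

A range that fails any of the three checks would suggest either a transcription error in the row or a mismatch between the stated coordinates and the sequence.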

SEQ ID N0:28 PAA5 Protein sequence 
Protein Accession #: NP_036581

1 11 21 31 41 51 

I I I I I I 

MESRKDITNQ EELWKMKPRR NLEEDDYLHK DTGETSMLKR PVLLHLHQTA HADEFDCPSE 60 

LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVIHPLAT SHQQYFYKIP ILVINKVLPM 120

VSITLLALVY LPGVIAAIVQ LHNGTKYKKF PHWLDKWMLT RKQFGLLSFF FAVLHAIYSL 180 

SYPMRRSYRY KLLNWAYQQV QQNKEDAWIE HDVWRMEIYV SLGIVGLAIL ALLAVTSIPS 240

VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQFVWYTPPT FMIAVFLPIV 300
VLIFKSILFL PCLRKKILKI RHGWEDVTKI NKTEICSQL 

SEQ ID NO:29 PAA7 DNA SEQUENCE

Nucleic Acid Accession #: NM_030774

Coding sequence: 1-963 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGAGTTCCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60 

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120 

AACTGCATCG TGGTCTTCAT CGTAAGGACG GAACGCAGCC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

CTTGCCCTTT TCTGGTTTGA TTCCCGAGAG ATTAGCTTTG AGGCCTGTCT TACCCAGATG 300

TTCTTTATTC ATGCCCTCTC AGCCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATGTGG CCATCTGCCA CCCACTGCGC CATGCTGCAG TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480

CTGATCAAGC GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540 

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCCCA ATGTGGTATA TGGTCTTACT 600 

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCTA TTTTCTGATA 660 

ATACGAACGG TTCTGCAACT GCCTTCCAAG TCAGAGCGGG CCAAGGCCTT TGGAACCTGT 720 

GTGTCACACA TTGGTGTGGT ACTCGCCTTC TATGTGCCAC TTATTGGCCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGACCCTTAA CACTACACTT CTCCTTATCT TTATTGGCTT GATAAACATA ATTATTTCTA 1020 

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA CATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320 

TTTATTATGG TTAGCTGTCA CATACAACTT TTTTTTTTTT TGAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGCGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGTT 1440 

GAAGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCACCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAGTGTT GGGATTACAG 1620 

GTGTGAACCA CTGTGCCCGG CCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATGGAATAAC ATATCAAATG AAACAGGGAA 1860 

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTTAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATGAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTGTGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160 

TTGAATACAG TTACTTAATG ACCATGTTAT ATTGCTTCCT GTGTAACATC TGCCATTTAT 2220 

TTCCTCAGCT GTACAAATCC TCTGTTTTCT CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTGTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTATTGCT TGCTTTTTTC CAGATTCAGG GAGAATGTTG 2400 

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460 

CTGTCAAAAA TTTTGAATGT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGATAAAA 2520 
TAAAATTTTA TTTTAAATTT T

SEQ ID NO:30 PAA7 PROTEIN SEQUENCE

Protein Accession #: NP_110401

1 11 21 31 41 51

I I I I I I 

MSSCNFTHAT FVLIGIPGLE KAHFWVGFPL LSMYVVAMFG NCIVVFIVRT ERSLHAPMYL 60

FLCMLAAIDL ALSTSTMPKI LALFWFDSRE ISFEACLTQM FFIHALSAIE STILLAMAFD 120

RYVAICHPLR HAAVLNNTVT AQIGIVAVVR GSLFFFPLPL LIKRLAFCHS NVLSHSYCVH 180

QDVMKLAYAD TLPNVVYGLT AILLVMGVDV MFISLSYFLI IRTVLQLPSK SERAKAFGTC 240

VSHIGVVLAF YVPLIGLSVV HRFGNSLHPI VRVVMGDIYL LLPPVINPII YGAKTKQIRT 300
RVLAMFKISC DKDLQAVGGK



SEQ ID NO:31 PAV6 DNA SEQUENCE

Nucleic Acid Accession #: XM_050837

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGAACTGGG AGCTGCTGCT GTGGCTGCTG GTGCTGTGCG CGCTGCTCCT GCTCTTGGTG 60 

CAGCTGCTGC GCTTCCTGAG GGCTGACGGC GACCTGACGC TACTATGGGC CGAGTGGCAG 120 

GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 180 

GGAATTGGTG AGGAGCTGGC TTACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 240 

GCCAGAAGAG TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAGAA TGGCAATTTA 300 

AAAGAAAAAG ATATACTTGT TTTGCCCCTT GACCTGACCG ACACTGGTTC CCATGAAGCG 360 

GCTACCAAAG CTGTTCTCCA GGAGTTTGGT AGAATCGACA TTCTGGTCAA CAATGGTGGA 420 

ATGTCCCAGC GTTCTCTGTG CATGGATACC AGCTTGGATG TCTACAGAAA GCTAATAGAG 480 

CTTAACTACT TAGGGACGGT GTCCTTGACA AAATGTGTTC TGCCTCACAT GATCGAGAGG 540 

AAGCAAGGAA AGATTGTTAC TGTGAATAGC ATCCTGGGTA TCATATCTGT ACCTCTTTCC 600 

ATTGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACAGAA 660 

CTTGCCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 720 

ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 780 

TCCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATGAT 840 

TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCAATAC 900 

ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGGA AGAAAAGGAT TGAGAACTTT 960 
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTTAAAATCT TTAAGACAAA ACATGACTGA 

SEQ ID NO:32 PAV6 Protein sequence
Protein Accession #: XP_050837



1 11 21 31 41 51 

I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60

GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180

KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD

SEQ ID NO:33 PBA6 DNA SEQUENCE

Nucleic Acid Accession #: NM_006853

Coding sequence: 26-874 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

| | | | | |

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 240

CGAGAAGACG CGGCTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 300

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360 

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 540 

CAGCTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 780

CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840

GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC 900

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140

TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG

SEQ ID NO:34 PBA6 PROTEIN SEQUENCE

Protein Accession #: NP_006844

1 11 21 31 41 51

I I I I I I 

MRILQLILLA LATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LIAPRWLLTA 60

AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNN SLPNKDHRND IMLVKMASPV 120

SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQLRLPHT LRCANITIIE HQKCENAYPG 180

NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV 240
DWIQETMKNN

SEQ ID NO:35 PBC1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001775

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CTAAAGCTCT CTTGCTGCCT AGCCTCCTGC CGGCCTCATC TTCGCCCAGC CAACCCCGCC 60 

TGCAGCCCTA TGGCCAACTG CGAGTTCAGC CCGGTGTCCG GGGACAAACC CTGCTGCCGG 120

CTCTCTAGCA GAGCCCAACT CTGTCTTGGC GTCAGTATCC TGGTCCTGAT CCTCGTCGTG 180 

GTGCTCGCGG TGGTCGTCCC GAGGTGGCGC CAGACGTGGA GCGGTCCGGG CACCACCAAG 240 

CGCTTTCCCG AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATG 300

AGACATCTAG ACTGCCAAAG TGTATGGGAT GCTTTCAAGG GTGCATTTAT TTCAAAACAT 360 

CCTTGCAACA TTACTGAAGA AGACTATCAG CCACTAATGA AGTTGGGAAC TCAGACCGTA 420 

CCTTGCAACA AGATTCTTCT TTGGAGCAGA ATAAAAGATC TGGCCCATCA GTTCACACAG 480 

GTCCAGCGGG ACATGTTCAC CCTGGAGGAC ACGCTGCTAG GCTACCTTGC TGATGACCTC 540 

ACATGGTGTG GTGAATTCAA CACTTCCAAA ATAAACTATC AATCTTGCCC AGACTGGAGA 600

AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTGGAAAA CGGTTTCCCG CAGGTTTGCA 660 

GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720 

AAAAACAGCA CTTTTGGGAG TGTGGAAGTC CATAATTTGC AACCAGAGAA GGTTCAGACA 780 

CTAGAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG CCAGGATCCC 840 

ACCATAAAAG AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900

ATCTACAGAC CTGACAAGTT TCTTCAGTGT GTGAAAAATC CTGAGGATTC ATCTTGCACA 960 

TCTGAGATCT GAGCCAGTCG CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020

CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGGTCCTCCA 1080 

CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140

CATCAGCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200
AATGAAAATT GTATGTTAAG TTACTTCCTT TAG

SEQ ID NO:36 PBC1 Protein sequence
Protein Accession #: NP_001766

1 11 21 31 41 51

I I I I I I

MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILVVVLA VVVPRWRQTW SGPGTTKRFP 60
ETVLARCVKY TEIHPEMRHV DCQSVWDAFK GAFISKHPCN ITEEDYQPLM KLGTQTVPCN 120
KILLWSRIKD LAHQFTQVQR DMFTLEDTLL GYLADDLTWC GEFNTSKINY QSCPDWRKDC 180
SNNPVSVFWK TVSRRFAEAA CDVVHVMLNG SRSKIFDKNS TFGSVEVHNL QPEKVQTLEA 240
WVIHGGREDS RDLCQDPTIK ELESIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSEI

SEQ ID NO:37 PBH1 DNA SEQUENCE

Nucleic Acid Accession #: XM_017718

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGTCCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60 

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120 

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACGAGAAT GTGTCTTCTT TACCAAAGAT 180 

TCCAAGGCCA CGGAGAATGT GTGCAAGTGT GGCTATGCCC AGAGCCAGCA CATGGAAGGC 240 

ACCCAGATCA ACCAAAGTGA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300 

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360 

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTG GCACCTGAAA 420 

ACACCCAACC TGGTCATTTC TGTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480 

ATCCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540 

ACGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGGTC 660

TCCAACCGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720 

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780 

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840 

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900 

ATTGTGTGTT TTGCCCAAGG AGGTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020 

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080 

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140 

CTCAAAGAAA TTCTCGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320 

TTAGCCAATG ATGAGATTTT CACCAATGAC CGCCGATGGG AGTCTGCTGA CCTTCAAGAA 1380 

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500

TTCAGCACGC TTGTGTACCG GAATCTGCAG ATCGCCAAGA ATTCCTATAA TGATGCCCTC 1560

CTCACGTTTG TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTCCGGAA GGAAGACAGA 1620

AATGGCCGGC ACGAGATGGA CATAGAACTC CACGACGTGT CTCCTATTAC TCGGCACCCC 1680 

CTGCAAGCTC TCTTCATCTG GGCCATTCTT CAGAATAAGA AGGAACTCTC CAAAGTCATT 1740 

TGGGAGCAGA CCAGGGGCTG CACTCTGGCA GCCCTGGGAG CCAGCAAGCT TCTGAAGACT 1800 

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTGGGGAGT CCGAGGAGCT GGCTAATGAG 1860

TACGAGACCC GGGCTGTTGA GCTGTTCACT GAGTGTTACA GCAGCGATGA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTGTGAAGCT TGGGGTGGAA GCAACTGTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAGCCTGGGG TCCAGAATTT TCTTTCTAAG 2040 

CAATGGTATC GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT GTGTCTGTTT 2100 

ATTATACCCT TGGTGGGCTG TGGCTTTGTA TCATTTAGGA AGAAACCTGT CGACAAGCAC 2160

AAGAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC CCTTCGTGGT CTTCTCCTGG 2220 

AATGTGGTCT TCTACATCGC CTTCCTCCTG CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTCGGTGC CACACCCCCC CGAGCTGGTC CTGTACTCGC TGGTCTTTGT CCTCTTCTGT 2340 

GATGAAGTGA GACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGGAATGTG 2400 

ATGGACACGC TGGGGCTTTT TTACTTCATA GCAGGAATTG TATTTCGGCT CCACTCTTCT 2460

AATAAAAGCT CTTTGTATTC TGGACGAGTC ATTTTCTGTC TGGACTACAT TATTTTCACT 2520 

CTAAGATTGA TCCACATTTT TACTGTAAGC AGAAACTTAG GACCCAAGAT TATAATGCTG 2580 

CAGAGGATGC TGATCGATGT GTTCTTCTTC CTGTTCCTCT TTGCGGTGTG GATGGTGGCC 2640 

TTTGGCGTGG CCAGGCAAGG GATCCTTAGG CAGAATGAGC AGCGCTGGAG GTGGATATTC 2700 

CGTTCGGTCA TCTACGAGCC CTACCTGGCC ATGTTCGGCC AGGTGCCCAG TGACGTGGAT 2760

GGTACCACGT ATGACTTTGC CCACTGCACC TTCACTGGGA ATGAGTCCAA GCCACTGTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCCCCGG TTCCCCGAGT GGATCACCAT CCCCCTGGTG 2880 

TGCATCTACA TGTTATCCAC CAACATCCTG CTGGTCAACC TGCTGGTCGC CATGTTTGGC 2940 

TACACGGTGG GCACCGTCCA GGAGAACAAT GACCAGGTCT GGAAGTTCCA GAGGTACTTC 3000 

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT ATCCCCTTCC CCTTCATCGT CTTCGCTTAC 3060 

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTGCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCTGCT GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGGAGGG TGTCATGAAG 3180 

GAAAACTACC TTGTCAAGAT CAACACAAAA GCCAACGACA CCTCAGAGGA AATGAGGCAT 3240 

CGATTTAGAC AACTGGATAC AAAGCTTAAT GATCTCAAGG GTCTTCTGAA AGAGATTGCT 3300 
AATAAAATCA AATGA 



SEQ ID NO:38 PBH1 Protein sequence
Protein Accession #: XP_017718

1 11 21 31 41 51

I I I I I I

MSFRAARLSM RNRRNDTLDS TRTLYSSASR STDLSYSESD LVNFIQANFK KRECVFFTKD 60 

SKATENVCKC GYAQSQHMEG TQINQSEKWN YKKHTKEFPT DAFGDIQFET LGKKGKYIRL 120 

SCDTDAEILY ELLTQHWHLK TPNLVISVTG GAXNFALKPR MRKIFSRLIY IAQSKGAWIL 180 

TGGTHYGLMK YIGEWRDNT ISRSSEENIV AIGIAAWGMV SNRDTLIRNC DAEGYFLAQY 240

LMDDFTRDPL YILDNNHTHL LLVDNGCHGH PTVEAKLRNQ LEKYISERTI QDSNYGGKIP 300 

IVCFAQGGGK ETLKAINTSI KNKIPCVWE GSGQIADVIA SLVEVEDALT SSAVKEKLVR 360 

FLPRTVSRLP EEETESWIKW LKEILECSHL LTVIKMEEAG DEIVSNAISY ALYKAFSTSE 420 

QDKDNWNGQL KLLLEWNQLD LANDEIFTND RRWESADLQE VHFTALIKDR PKFVRLFLEN 480 

GLNLRKFLTH DVLTBLFSNH FSTLVYRNLQ IAKNSYNDAL LTFVWKLVAN FRRGFRKEDR 540

NGRDEMDIEL KDVSPITRHP LQALFIWAIL QNKKELSKVI WEQTRGCTLA ALGASKLLKT 600 

LAKVKNDINA AGESEELANE YETRAVELFT ECYSSDEDLA EQLLVYSCEA WGGSNCLELA 660 

VEATDQHFIA QPGVQNFLSK QWYGEISRDT KNWKIILCLF IIPLVGCGFV SFRKKPVDKH 720

KKLLWYYVAF FTSPFVVFSW NVVFYIAFLL LFAYVLLMDF HSVPHPPELV LYSLVFVLFC 780

DEVRQWYVNG VNYFTDLWNV MDTLGLFYFI AGIVFRLHSS NKSSLYSGRV IFCLDYIIFT 840

LRLIHIFTVS RNLGPRIIML QRMLIDVFFF LFLFAVWMVA FGVARQGILR QNEQRWRWIF 900 

RSVIYEPYLA MFGQVPSDVD GTTYDFAHCT FTGNESKPLC VELDEHNLPR FPEWITIPLV 960 

CIYMLSTNIL LVNLLVAMFG YTVGTVQENN DQVWKFQRYF LVQEYCSRLN IPFPFIVFAY 1020 

FYMVVKKCFK CCCKEKNMES SVCCFKNEDN ETLAWEGVMK ENYLVKINTK ANDTSEEMRH 1080
RFRQLDTKLN DLKGLLKEIA NKIK 



SEQ ID NO:39 PBH3 DNA SEQUENCE

Nucleic Acid Accession #: XM_011804

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

ATGCCTCGCC TGTTCTTGTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60 

AGAGCAGTCG CGGCCAAATG GAAGGACGAT GTTATTAAAT TATGCGGCCG CGAATTAGTT 120

CGCGCGCAGA TTGCCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGCCACCGGA GCTGAAGGCA 300 

GCCCTATCTG AGAGGCAACC ATCATTACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAATAGGCA AAGTGAAGCC 420

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGGCTTGG ATACTCATTC TCAAAAAAAG 480

AGACGACCCT ACGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540 
CTTGCTAAAT ATTGCTGA 

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE

Protein Accession #: NPJ508842

1 11 21 31 41 51

I I I I I I

MPRLFLFHLL EFCLLLNQFS RAVAAKWKDD VIKLCGRELV RAQIAICGMS TWSKRSLSQE 60 

DAPQTPRPVA EIVPSFINKD TETIIIMLEF IANLPPELKA ALSERQPSLP ELQQYVPALK 120 

DSNLSFEEFK KLIRNRQSEA ADSNPSELKY LGLDTHSQKK RRPYVALFEK CCLIGCTKRS 180
LAKYC 

SEQ ID NO:41 PBH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_005845

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGACGCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGGCT CAATCCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120 

GAAGATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180 

CAAGGGTTCT GGGATAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTTCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300 

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTGG GAAAAATTAT TAATTATTTT 360 

GAAAATTATG ATCCCATGGA TTCTGTGGCT TTGAACACAG CGTACGCCTA TGCCACGGTG 420 

CTGACTTTTT GCACGCTCAT TTTGGCTATA CTGCATCACT TATATTTTTA TCACGTTCAG 480 

TGTGCTGGGA TGAGGTTACG AGTAGCCATG TGCCATATGA TTTATCGGAA GGCACTTCGT 540 

CTTAGTAACA TGGCCATGGG GAAGACAACC ACAGGCCAGA TAGTCAATCT GCTGTCCAAT 600 

GATGTGAACA AGTTTGATCA GGTGACAGTC TTCTTACACT TCCTGTGGGC AGGACCACTG 660 

CAGGCGATCG CAGTGACTGC CCTACTCTGG ATGGAGATAG GAATATCGTG CCTTGCTGGG 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780 

CTGAGGAGTA AAACTGCAAC TTTCACGGAT GCCAGGATCA GGACCATGAA TGAAGTTATA 840 

ACTGGTATAA GGATAATAAA AATGTACGCC TGGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTGAGAAGTT CCTGCCTCAG GGGGATGAAT 960 

TTGGCTTCGT TTTTCAGTGC AAGCAAAATC ATCGTGTTTG TGACCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGCCAGCCGC GTGTTCGTGG CAGTGACGCT GTATGGGGCT 1080 

GTGCGGCTGA CGGTTACCCT CTTCTTCCCC TCAGCCATTG AGAGGGTGTC AGAGGCAATC 1140 

GTCAGCATCC GAAGAATCCA GACCTTTTTG CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260

AAGGCATCAG AGACCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGCCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGCCTA TGTGTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGTGCTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620 

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680 

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800 

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGTGCAGA AGGGGACTTA CACTGAGTTC 1860 

CTAAAATCTG GTATAGATTT TGGCTCCCTT TTAAAGAAGG ATAATGAGGA AAGTGAACAA 1920 

CCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TATAAGAATT ACTTCAGAGC TGGTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATGTAACCGA GAAGCTAGAT 2280 

CTTAACTGGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTACCGTTCT TTTTGGCATA 2340 

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGGA CACTTGGATG ATTTGCTGCC GCTGACGTTT 2520 

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TGTGGCCGTG 2580 

ATTCCTTGGA TCGCAATACC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCG GAGTCCAGTG 2700 

TTTTCCCACT TGTCATCTTC TCTCCAGGGG CTCTGGACCA TCCGGGCATA CAAAGCAGAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820 

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC CGTCTGGATG CCATCTGTGC CATGTTTGTC 2880 

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTATGCCCT CACGCTCATG GGGATGTTTC AGTGGTGTGT TCGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC CCACCACCAG CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180 

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240 

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300 

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

CCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AACCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAATCAG GATCCAATTT TAGTGTTGGA 3540 

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600 

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCCGG 3660 

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGCGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGGCAGAA 3840 

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG cactgtga 
SEQ ID NO:42 PBH5 PROTEIN SEQUENCE

Protein Accession #: NP_005836

1 11 21 31 41 51

I I I I I I

MLPVYQEVKP NPLQDANLCS RVFFWWLNPL FKIGHKRRLE EDDMYSVLPE DRSQHLGEEL 60

QGFWDKEVLR AENDAQKPSL TRAIIKCYWK SYLVLGIFTL IEESAKVIQP IFLGKIINYF 120 

ENYDPMDSVA LNTAYAYATV LTFCTLILAI LHHLYFYHVQ CAGMRLRVAM CHMIYRKALR 180

LSNMAMGKTT TGQIVNLLSN DVNKFDQVTV FLHFLWAGPL QAIAVTALLW MEIGISCLAG 240

MAVLIILLPL QSCFGKLFSS LRSKTATFTD ARIRTMNEVI TGIRIIKMYA WEKSFSNLIT 300

NLRKKEISKI LRSSCLRGMN LASFFSASKI IVFVTFTTYV LLGSVITASR VFVAVTLYGA 360

VRLTVTLFFP SAIERVSEAI VSIRRIQTFL LLDEISQRNR QLPSDGKKMV HVQDFTAFWD 420 

KASETPTLQG LSFTVRPGEL LAVVGPVGAG KSSLLSAVLG ELAPSHGLVS VHGRIAYVSQ 480

QPWVFSGTLR SNILFGKKYE KERYEKVIKA CALKKDLQLL EDGDLTVIGD RGTTLSGGQK 540 

ARVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFELCICQ ILHEKITILV THQLQYLKAA 600

SQILILKDGK MVQKGTYTEF LKSGIDFGSL LKKDNEESEQ PPVPGTPTLR NRTFSESSVW 660 

SQQSSRPSLK DGALESQDTE NVPVTLSEEN RSEGKVGFQA YKNYFRAGAH WIVFIFLILL 720 

NTAAQVAYVL QDWWLSYWAN KQSMLNVTVN GGGNVTEKLD LNWYLGIYSG LTVATVLFGI 780

ARSLLVFYVL VNSSQTLHNK MFESILKAPV LFFDRNPIGR ILNRFSKDIG HLDDLLPLTF 840 

LDFIQTLLQV VGVVSVAVAV IPWIAIPLVP LGIIFIFLRR YFLETSRDVK RLESTTRSPV 900

FSHLSSSLQG LWTIRAYKAE ERCQELFDAH QDLHSEAWFL FLTTSRWFAV RLDAICAMFV 960

IIVAFGSLIL AKTLDAGQVG LALSYALTLM GMFQWCVRQS AEVENMMISV ERVIEYTDLE 1020

KEAPWEYQKR PPPAWPHEGV IIFDNVNFMY SPGGPLVLKH LTALIKSQEK VGIVGRTGAG 1080 

KSSLISALFR LSEPEGKIWI DKILTTEIGL HDLRKKMSII PQEPVLFTGT MRKNLDPFNE 1140

HTDEELWNAL QEVQLKETIE DLPGKMDTEL AESGSNFSVG QRQLVCLARA ILRKNQILII 1200

DEATANVDPR TDELIQKKIR EKFAHCTVLT IAHRLNTIID SDKIMVLDSG RLKEYDEPYV 1260

LLQNKESLFY KMVQQLGKAE AAALTETAKQ VYFKRNYPHI GHTDHMVTNT SNGQPSTLTI 1320
FETAL 

SEQ ID NO:43 PBQ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_021233

Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGGGAAAG TGTCCTGCTG TGGCATGAAA TAAATGAAAC AGAAAATGAT GGCAAGACTG 60 

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120 

ATTTCATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTACCT AGACTCTACA 240 

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA CCAAGAGTGT TTTGGGAAGG 300

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TGGCTGATTC ATTCCATCCC TCAGTTTCCT 480 

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GACGAAATGG ACAAAGTGGC 540 

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAATAG ATTCTCAGCT CTTGGTCTGC 600 

AACCCCAACG TCTATAGCTG CTCCATCCCA GCCACCTTTC ACCAGGAGCT CATTCACATG 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT CGGATTCTTT TCTTGACGAC 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTG AAGACACACT TGTTAACAGA AACCTGGCAG 840 

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTTCCTT ACCATGTCTA CAATATAAAA 900 

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATTCATT TGTACCCAGA ATTGGCAAAT TTACCAAGCA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAAGTAAA CTTGGTGAAA GGACACAGGT 






SEQ ID NO:44 PBQ7 Protein sequence
Protein Accession #: NP_067056



1 11 21 31 41 51

| | | | | |

MMARLLRTSF ALLFLGLFGV LGAATISCRN EEGKAVDWFT FYKLPKRQNK ESGETGLEYL 60 

YLDSTTRSWR KSEQLMNDTK SVLGRTLQQL YEAYASKSNN TAYLIYNDGV PKPVNYSRKY 120

GHTKGLLLWN RVQGFWLIHS IPQFPPIPEE GYDYPPTGRR NGQSGICITF KYNQYEAIDS 180 

QLLVCNPNVY SCSIPATFHQ ELIHMPQLCT RASSSEIPGR LLTTLQSAQG QKFLHFAKSD 240

SFLDDIFAAW MAQRLKTHLL TETWQRKRQE LPSNCSLPYH VYNIKAIKLS RHSYFSSYQD 300 
HAKWCISQKG TKNRWTCIGD LNRSPHQAFR SGGFICTQNW QIYQAFQGLV LYYESCK



SEQ ID NO:45 PCQ8 DNA SEQUENCE

Nucleic Acid Accession #: XM_030453
Coding sequence: 89-1273 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGGTGCCCTG GGGTGGAATA TCCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTGACG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGGTTTG GTTCAGTGAA AATATGTTTG GACCAGATTT 240 

CAGTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT TTGCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360 

CATCACCTAC CAGAGCAAGC TGGCCAAGGA CGTGCTGGAC ACCATCCTAG GCATCCAACC 420

CAAGGACACC TCTGGTGGAG GGGATGAGAC CCGGGAGGCG GTGGTGGCCC GGCTGGCTGA 480

TGATATGCTG GAGAAGCTGC CCCCAGACTA TGTCCCCTTT GAAGTAAAAG AGAGGCTGCA 540

GAAGATGGGG CCATTCCAGC CTATGAACAT TTTCCTCAGG CAGGAAATAG ACAGAATGCA 600

AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AAACTTGCTA TTGATGGCAC 660

CATCATCATG AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 720

TGCTTGGTGG AAAAAAGCTT CTTGGGTTTT TAGTACACTG GGTTTCTGGT TTACTGAACT 780

TATAGAAAGA AACAGCCAGT TTACCTCGTG GGTTTTCAAT GGCCGACCTC ACTGCTTTTG 840

GATGACGGGT TTTTTTAACC CCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 900

GGCCAACAAA GGCTGGGCTC TGGACAATAT GGTGCTTTGC AATGAAGTCA CCAAATGGAT 960

GAAGGACGAC ATTTCTACCC CTCCCACAGA GGGTGTCTAT GTCTATGGCT TATATCTTGA 1020

AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATTGAA TCAAAGCCAA AAGTGCTCTT 1080

TGAGTTGATG CCTGTCATAA GGATTTATGC AGAAAACAAT ACTTTACGAG ATCCTCGGTT 1140

TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TTGCCGCTGT 1200

GGATCTCAGG ACAGCCCAGA CCCCTGAACA CTGGGTGCTC CGTGGGGTTG CCCTTCTGTG 1260

TGATGTCAAG TAACATGTGG GGAGTGTCCC CACCCAATGC TTTGGAAAAT GCAAGATCTA 1320

AATTATTGTA ACCTTTATTT CTGTATGACT GCTGGACAGT GTATGTTAGG TCGTTTATGC 1380

AATTAATGAG CTGCATAGGT TTTCCCCACT CCTTAATTGG ATGCTTATAT TTTACTTGTT 1440

TCATCATTAG TGACCAATGT CTGAGTTTGT TGAAAATGTT ATTTAGTGAT ATAAAAGTAA 1500

ATTTACAGCA TCCTAATGAA GTGTGGCCCT CAAATCCACA GTAGTATATT TTCTTCTTAC 1560

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAATATA TTTGCATGTG GACAAAGATT 1620

AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGATAGCAA GAATTATAGT TGGCTTGAAA 1680

AAATGTGATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA ATATTAGACG GTGCGTAGGG 1740

ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT 1800

CTTTAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAATA 1860

TAGTCAGTAC TAAATTAGAA TTGTGGTTTA TAAACTTTTG GTTAGCTCTG GATCTGTATA 1920

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA CCGGGAGACA AGTGTGGGTC 1980

CCTCTCACTG GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA 2040

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG 2100

TTCTTTCTTT TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT 2160

CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT 2220
TTTACTAAAA AAAAAAAAAA AAA

SEQ ID NO:46 PCQ8 Protein sequence
Protein Accession #: BAB15543

1 11 21 31 41 51

I I I I I I

MDVKKGVSWT TIRYMIGEIQ YGGRVTDDYD KRLLNTFAKV WFSENMFGPD FSFYQGYNIP 60

KCSTVDNYLQ YIQSLPAYDS PEVFGLHPNA DITYQSKLAK DVLDTILGIQ PKDTSGGGDE 120

TREAVVARLA DDMLEKLPPD YVPFEVKERL QKMGPFQPMN IFLRQEIDRM QRVLSLVRST 180

LTELKLAIDG TIIMSENLQD ALDCMFDARI PAWWKKASWV FSTLGFWFTE LIERNSQFTS 240

WVFNGRPHCF WMTGFFNPQG FLTAMRQEIT RANKGWALDN MVLCNEVTKW MKDDISTPPT 300

EGVYVYGLYL EGAGWDKRNM KLIESKPKVL FELMPVIRIY AENNTLRDPR FYSCPIYKKP 360
VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK

SEQ ID NO:47 PDG5 DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 68-3349 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60

AGATGACATG GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120

ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT TCTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360

GGCCCAATCC AAAATGGAGT CAGCCCAGGA TGTTCAAACT ATCTGCAAAG AAAAGCCTTC 420

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480

AGGAGATGTT TATGCCAAGA CTCTGCCTCC CAGAAGCCTT TTTCAGTCCT CAAGGAAGCC 540

TGATGCTGAA GAAGTCTCCT CAGATTCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600

AGAACTGGCT CATGGTCACT CTTCCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720

CAGATGCCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780

CAGTTATGTT GAAAAGTACA ACACTTCTGA TGATTGCAGC AGCTCAGAGG AAGACCTGCC 840

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900

TTCAAATAAT ACTCCTGAAG AGCAGAATGA TTTTATGCAG CAGCTGCCTT CCAGATGCCC 960

TTCTCAGCCC ATTATGAATC CTACTGTTCA GCAACAAGTC CCCACCAGTT CAGTGGGCAC 1020

TTCTATAAAA CAGAGCGATT CCGTGGAGCC AATCCCTCCA AGACACCCTT TCCAGCCATG 1080

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140

GAGCATTTCT ATGAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200

AGTTCAACAA AACATGTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260

GGAGCCACTA CTCCCCAGAT ATTCTCCTCA GTCCTTGACA GATCCTCAAA TCCGGCAAAT 1320

CTCAGAAAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAGAATGGAG 1440

CAGTCCTGTG GCACCAACAC CTTCCAAATA CACTTCCCCG CCATGGGTGA CCCCTAAATT 1500

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560

TAAGGAGCAG CTGCTTCCCA GACATCTTTC CCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620

ACTGTCCTCA AATTTCGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA GTCCATTGCC 1680

TCCCCAATAT GCTACCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAGAAAATG GCTGTTGAAG GCACTTCTAA CAAATCACCG ATTCCCAGGC GTCCGACCCA 1800 

GTCATTCGTG AAATTTATGG CACAGCAAAT CTTTTCAGAG AGCTCTGCTC TTAAGAGGGG 1860

CAGTGATGTG GCACCTCTGC CTCCCAATCT TCCTTCCAAA TCTTTATCAA AGCCTGAAGT 1920 

CAAGCACCAA GTTTTCTCAG ATTCAGGGAG TGCTAATCCT AAGGGAGGCA TTTCTTCAAA 1980 

GATGCTACCT ATGAAGCACC CTTTACAGTC CTTGGGGAGG CCTGAAGACC CACAGAAAGT 2040 

TTTCTCTTAT TCAGAGAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAGAGC AGCTGTCTCC 2100 

CAGGCAGCTT TCCCAGGCCT TGAGGAAACC TGAGTATGAG CAAAAAGTCT CCCCTGTTTC 2160 

TGCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAG CTGCCTCCCA AACATTCTTC 2220 

CCAAGCCTCA GATAGGTCTA AATTCCAGCC ACAGATGTCA TCAAAGGGCC CAGTGAATGT 2280 

ACCTGTAAAG CAGAGCAGCG GTGAGAAGCA CCTGCCTTCA AGTAGTCCTT TCCAGCAACA 2340 

GGTTCATTCA AGTTCTGTGA ATGCTGCTGC TAGGCGATCT GTTTTTGAGA GCAATTCTGA 2400 

CAATTGGTTC CTAGGAAGAG ATGAAGCTTT TGCAATCAAA ACCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC CCCATAAAGA GCATTCCAGC CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTGTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTGA 2580 

GAGTGGTGAT GGTAATAATA ACCAGCATGC AAACCTATCC AATCAGGATG ATGTTGAAAA 2640 

GCTTTTTGGA GTTCGACTGA AAAGAGCCCC TCCCTCGCAG AAGTATAAGA GTGAGAAACA 2700 

AGATAACTTC ACCCAGCTTG CTTCAGTGCC CTCGGGCCCA ATTTCATCCT CTGTAGGCAG 2760 

GGGACATAAA ATCAGAAGCA CTTCCCAGGG GCTCCTGGAT GCTGCAGGGA ACCTCACCAA 2820 

AATATCTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880 

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGCCGGTT TGGATAACTA TGGCAAAGCA GAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC TGGAGCCGAT GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCTG CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

CCATAAACAG GAGAAGACAG CACAGATGAA GCCACCTAAG CCTACAAAAT CAGTTGGATT 3180 

TGAAGCTCAG AAGATACTGC AAGTTCCTGC CATGGAAAAA GAAACCAAAC GATCTTCAAC 3240 

TCTCCCAGCC AAGTTCCAGA ACCCAGTTGA GCCAATTGAG CCTGTCTGGT TCTCACTGGC 3300 

CAGGAAGAAA GCCAAAGCAT GGAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360 

GTGGAGCATC AGCATTTATT TTATTTAGTT TTTTTTTTTT TTTTTTTTTT GAGACAGAGT 3420 

CTCGCTCTGT TACCCAGATT GGAGTGCAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480 

CTCCCGGGTT CACGCCACTC TCCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540 

CGCCATCACG CCCGGCTAAT TTTGTTTTCG TATTTTTAGT AGAGACGGGG TTTCACCATG 3600 

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAG 3660 

CTGGGATTAC AGGCGTGAGC CACCGCGCCC GGCCAAGCAT CAGCGTTTTA AATGATAATT 3720 

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780

AAATAAGTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840 

AGCCTTAGAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGCC ATGAATGCTT 3900 

AACCTTCCAC GTAGTCACTG CCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960 

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020 

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACTAGAAA ATTGTCATTT CCCTCAACCA 4080 

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140 

TTATGAGCCT AGATAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGGATC 4200 

CTAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC CCTTGAAACA TCTTCCCTGT GTCCCTGGGG GCCCCTGATG 4320 

CTTTCTCCTT GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380 

ACCCCACGTC GACCTGTCAC AGGCCTGGCT GTAGCGAGCA CCTCCCTATG ACGCAGAATG 4440 

CTTCTTGGGA ATTATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GCTTCTTGTC 4500 

CCAATACTAC TGTGACCCTC TCTGATCGCA CAGAAATCAC TGCCTATCAC ATATATCCTG 4560 

TTAAGCACTG AAGACCCTAT TGAAATTAGA GTTCTACAGA TGCCAAAAGC TGTACTTTCC 4620 

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGCCACTGG AGCTCCTTCC 4680 

TCTCTGGTTC CAGCATTAAG GTGGAGAACT CCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TGTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACAGAGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAG TGAGTAAATC AGCTTTTCCC CTCTCATTCC 4860 

TAGAGGCACA CACCTCAAAA GTTACTAGGC TGGAGAGACC CTACCTTCCA GTGACCCACT 4920 

CATCCCCCAG CCACGGAGAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980 

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCCCAGC AAATGTTGAT GCCTCCCTTC 5040 

CCATCTTGCC ACACGGTCTT TTTCTTTTGT AGCACAGCCT CCATTAATAA CTCCTCGGCT 5100 

GAGGATGAAG ATGTAGGCAC CTTTACCCCC AGAGCCAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA CCACCCTAGA ATCTCATCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCTCTGTCTT TCTCTTCATT CCATCCCCCA AACCCACCAA ACACTAAGGG AGAGCTCCCT 5280 

TTGGATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACCCAGAAG TGACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTTACA CTGTGCTGCT TTGCTTAAAC AGAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGGT GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AGTGAGGTGT CAGAGCTAAA CACTTGGTGC TGGGTTTTGT TGATGCTGGT 5520 

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580 

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT GAATTTTATT 5640 

TCCTTTGTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTTATC TGGGTACTCC 5700 

TGGTAGATTA GCTGTTACAC CTCCCTTCCC TTTTTTACAG TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTCCA ATAACAATTT CTTTTCCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCTTC CTTTTTTTCA CAGTGAACCT GTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACAGTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940 

AGTTCTTTTT AAATGCCTAA AGGCATATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

SEQ ID NO:48 PDG5 Protein sequence
Protein Accession #: BAA86524

1 11 21 31 41 51 

I I I I I I 

EQPTTSQPET TTPQGLLSDK DDMGRRNAGI DFGSRKASAA QPIPENMDNS MVSDPQPYHE 60

DAASGAEKTE ARASLSLMVE SLSTTQEEAI LSVAAEAQVF MNPSHIQLED QEAFSFDLQK 120

AQSKMESAQD VQTICKEKPS GNVHQTFTAS VLGMTSTTAK GDVYAKTLPP RSLFQSSRKP 180

DAEEVSSDSE NIPEEGDGSE ELAHGHSSQS LGKFEDEQEV FSESKSFVED LSSSEEELDL 240 

RCLSQALEEP EDAEVFTESS SYVEKYNTSD DCSSSEEDLP LRHPAQALGK PKNQQEVSSA 300

SNNTPEEQND FMQQLPSRCP SQPIMNPTVQ QQVPTSSVGT SIKQSDSVEP IPPRHPFQPW 360 

VNPKVEQEVS SSPKSMAVEE SISMKPLPPK LLCQPLMNPK VQQNMFSGSE DIAVERVISV 420

EPLLPRYSPQ SLTDPQIRQI SESTAVEEGT YVEPLPPRCL SQPSERPKFL DSMSTSAEWS 480 

SPVAPTPSKY TSPPWVTPKF EELYQLSAHP ESTTVEEDIS KEQLLPRHLS QLTVGNKVQQ 540 

LSSNFERAAI EADISGSPLP PQYATQFLKR SKVQEMTSRL EKMAVEGTSN KSPIPRRPTQ 600

SFVKFMAQQI FSESSALKRG SDVAPLPPNL PSKSLSKPEV KHQVFSDSGS ANPKGGISSK 660 

MLPMKHPLQS LGRPEDPQKV FSYSERAPGK CSSFKEQLSP RQLSQALRKP EYEQKVSPVS 720 

ASSPKEWRNS KKQLPPKHSS QASDRSKFQP QMSSKGPVNV PVKQSSGEKH LPSSSPFQQQ 780

VHSSSVNAAA RRSVFESNSD NWFLGRDEAF AIKTKKFSQG SKNPIKSIPA PATKPGKFTI 840 

APVRQTSTSG GIYSKKEDLE SGDGNNNQHA NLSNQDDVEK LFGVRLKRAP PSQKYKSEKQ 900 

DNFTQLASVP SGPISSSVGR GHKIRSTSQG LLDAAGNLTK ISYVADKQQS RPKSESMAKK 960 

QPACKTPGKP AGQQSDYAVS EPVWITMAKQ KQKSFKAHIS VKELKTKSNA GADAETKEPK 1020

YEGAGSANEN QPKKMFTSSV HKQEKTAQMK PPKPTKSVGF EAQKILQVPA MEKETKRSST 1080 
LPAKFQNPVE PIEPVWFSLA RKKAKAWSHM AEITQ 

SEQ ID NO:49 PAB7 DNA SEQUENCE

Nucleic Acid Accession #: D87742

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GCTTTCCTTT CTAAAGTAGA AGAGGATGAT TATCCCTCTG AAGAACTACT AGAGGATGAA 60 

AACGCTATAA ATGCAAAACG GTCTAAAGAA AAAAACCCTG GGAATCAGGG CAGGCAGTTT 120 

GATGTTAATC TGCAAGTCCC TGACAGAGCA GTTTTAGGGA CCATTCATCC AGATCCAGAA 180 

ATTGAAGAAA GCAAGCAAGA AACTAGTATG ATTTTGGATA GTGAAAAAAC AAGTGAGACT 240 

GCTGCCAAAG GGGTCAACAC AGGAGGCAGG GAACCAAATA CAATGGTGGA AAAAGAACGC 300 

CCTCTGGCAG ATAAGAAAGC ACAGAGACCA TTTGAACGAA GTGACTTTTC TGACAGCATA 360 

AAAATTCAGA CTCCAGAATT AGGTGAAGTG TTTCAGAATA AAGATTCTGA TTATCTGAAG 420 

AACGACAACC CTGAGGAACA TCTGAAGACC TCAGGGCTTG CAGGGGAGCC TGAGGGAGAA 480 

CTCTCAAAAG AGGACCATGG GAACACAGAG AAGTACATGG GCACAGAAAG CCAGGGGTCT 540 

GCTGCTGCAG AACCTGAAGA TGACTCGTTC CACTGGACTC CACATACAAG TGTAGAGCCA 600 

GGGCATAGTG ACAAGAGGGA GGACTTACTT ATCATAAGCA GCTTCTTTAA AGAACAACAG 660 

TCTTTGCAGC GGTTCCAGAA GTACTTTAAT GTCCATGAGC TGGAAGCCTT GCTACAAGAA 720 

ATGTCATCAA AACTGAAGTC AGCGCAGCAG GAGAGCCTGC CCTATAATAT GGAAAAAGTC 780 

CTAGATAAGG TCTTCCGTGC TTCTGAGTCA CAAATTCTGA GCATAGCAGA AAAAATGCTT 840 

GATACTCGTG TGGCTGAAAA TAGAGATCTG GGAATGAACG AAAATAACAT ATTTGAAGAG 900

GCTGCAGTGC TTGATGACAT TCAAGACCTC ATCTATTTTG TCAGGTACAA GCACTCCACA 960 

GCAGAGGAGA CAGCCACACT GGTGATGGCA CCACCTCTAG AGGAAGGCTT GGGTGGAGCA 1020 

ATGGAAGAGA TGCAACCACT GCATGAAGAT AATTTCTCAC GAGAGAAGAC AGCAGAACTT 1080 

AATGTGCAGG TTCCTGAAGA ACCCACCCAC TTGGACCAAC GTGTGATTGG GGACACTCAT 1140 

GCCTCAGAAG TGTCACAGAA GCCAAATACT GAGAAAGACC TGGACCCAGG GCCAGTTACA 1200 

ACAGAAGACA CTCCTATGGA TGCTATTGAT GCAAACAAGC AACCAGAGAC AGCCGCCGAA 1260 

GAGCCGGCAA GTGTCACACC TTTGGAAAAC GCAATCCTTC TAATATATTC ATTCATGTTT 1320 

TATTTAACTA AGTCGCTAGT TGCTACATTG CCTGATGATG TTCAGCCTGG GCCTGATTTT 1380 

TATGGACTGC CATGGAAACC TGTATTTATC ACTGCCTTCT TGGGAATTGC TTCGTTTGCC 1440 

ATTTTCTTAT GGAGAACTGT CCTTGTTGTG AAGGATAGAG TATATCAAGT CACGGAACAG 1500 

CAAATTTCTG AGAAGTTGAA GACTATCATG AAAGAAAATA CAGAACTTGT ACAAAAATTG 1560 

TCAAATTATG AACAGAAGAT CAAGGAATCA AAGAAACATG TTCAGGAAAC CAGGAAACAA 1620 

AATATGATTC TCTCTGATGA AGCAATTAAA TATAAGGATA AAATCAAGAC ACTTGAAAAA 1680 

AATCAGGAAA TTCTGGATGA CACAGCTAAA AATCTTCGTG TTATGCTAGA ATCTGAGAGA 1740 

GAACAGAATG TCAAGAATCA GGACTTGATA TCAGAAAACA AGAAATCTAT AGAGAAGTTA 1800 

AAGGATGTTA TTTCAATGAA TGCCTCAGAA TTTTCAGAGG TTCAGATTGC ACTTAATGAA 1860 

GCTAAGCTTA GTGAAGAGAA GGTGAAGTCT GAATGCCATC GGGTTCAAGA AGAAAATGCT 1920 

AGGCTTAAGA AGAAAAAAGA GCAGTTGCAG CAGGAAATCG AAGACTGGAG TAAATTACAT 1980 

GCTGAGCTCA GTGAGCAAAT CAAATCATTT GAGAAGTCTC AGAAAGATTT GGAAGTAGCT 2040 

CTTACTCACA AGGATGATAA TATTAATGCT TTGACTAACT GCATTACACA GTTGAATCTG 2100 

TTAGAGTGTG AATCTGAATC TGAGGGTCAA AATAAAGGTG GAAATGATTC AGATGAATTA 2160 

GCAAATGGAG AAGTGGGAGG TGACCGGAAT GAGAAGATGA AAAATCAAAT TAAGCAGATG 2220 

ATGGATGTCT CTCGGACACA GACTGCAATA TCGGTAGTTG AAGAGGATCT AAAGCTTTTA 2280 

CAGCTTAAGC TAAGAGCCTC CGTGTCCACT AAATGTAACC TGGAAGACCA GGTAAAGAAA 2340 

TTGGAAGATG ACCGCAACTC ACTACAAGCT GCCAAAGCTG GACTGGAAGA TGAATGCAAA 2400 

ACCTTGAGGC AGAAAGTGGA GATTCTGAAT GAGCTCTATC AGCAGAAGGA GATGGCTTTG 2460 

CAAAAGAAAC TGAGTCAAGA AGAGTATGAA CGGCAAGAAA GAGAGCACAG GCTGTCAGCT 2520 

GCAGATGAAA AGGCAGTTTC GGCTGCAGAG GAAGTAAAAA CTTACAAGCG GAGAATTGAA 2580 

GAAATGGAGG ATGAATTACA GAAGACAGAG CGGTCATTTA AAAACCAGAT CGCTACCCAT 2640 

GAGAAGAAAG CTCATGAAAA CTGGCTCAAA GCTCGTGCTG CAGAAAGAGC TATAGCTGAA 2700 

GAGAAAAGGG AAGCTGCCAA TTTGAGACAC AAATTATTAG AATTAACACA AAAGATGGCA 2760 

ATGCTCCAAG AAGAACCTGT GATTGTAAAA CCAATGCCAG GAAAACCAAA TACACAAAAC 2820 

CCTCCACGGA GAGGTCCTCT GAGCCAGAAT GGCTCTTTTG GCCCATCCCC TGTGAGTGGT 2880 

GGAGAATGCT CCCCTCCATT GACAGTGGAG CCACCCGTGA GACCTCTCTC TGCTACTCTC 2940 

AATCGAAGAG ATATGCCTAG AAGTGAATTT GGATCAGTGG ACGGCCCTCT ACCTCATCCT 3000 

CGATGGTCAG CTGAGGCATC TGGGAAACCC TCTCCTTCTG ATCCAGGATC TGGTACAGCT 3060 

ACCATGATGA ACAGCAGCTC AAGAGGCTCT TCCCCTACCA GGGTACTCGA TGAAGGCAAG 3120 

GTTAATATGG CTCCAAAAGG GCCCCCTCCT TTCCCAGGAG TCCCTCTCAT GAGCACCCCC 3180 

ATGGGAGGCC CTGTACCACC ACCCATTCGA TATGGACCAC CACCTCAGCT CTGCGGACCT 3240 

TTTGGGCCTC GGCCACTTCC TCCACCCTTT GGCCCTGGTA TGCGTCCACC ACTAGGCTTA 3300 



AGAGAATTTG 
TTTTTACCTG 
ATTCCTGGTA 
GCTGTAAGAG 
ACTAGCCAGG 
TTCATTGGAA 
AAAATCCAAA 
CATTTTTGAG 
AGCTAGAGCG 
GTAGCATATG 
GAAATGCTTT 
GGAGCAATGG 
AAAATGTTTA 
TAGCTCATAA 
ATGAGGCTTG 
CTGGTGGCAC 
TATTTCAAAG 
ATTGTCTATT 
ACCTGATGTT 
AAATGATGTG 
CCTTATCTAT 
AAGAGTATAA 
GTCAGCAACC 
ATTATTCCAA 
TAACTAACCA 
AAACAATGTT 
TACAGTAATA 
AAAGGCTGAT 
TGTAATATTT 
AATACCTTGT 
ACAACTGAAG 
AGTTCATAAG 
TTTAATATCA 
TTAGCCATGT 
CTCTTTAGGA 
AGTGTAGATT 
ATTCAAAATA 
GACATAATTG 
GAGCCAGTCC 
CAAAGCAGGG 
TTCCATCTCT 
AGTTGCTAAA 
GCCATAGTTG 
AGTAATTCGT 
TTATATTCAG 



CACCAGGCGT 
GACACGCACC 
CCCGATTACC 
ACTTACTGCC 
ACTGTTCACA 
AGAAAGTGTA 
AGTTTATTTT 
CCAAACAATT 
TCCTTACAAC 
TAATTGCAAA 
AAGAACATGT 
TGTTTATAAG 
CTAAAAGATC 
AAATTTGTTT 
TGCCATTTGG 
ACTTCCGGCT 
AAGTTTATTT 
TGAGAATGGT 
CCATTGTTTT 
TCATGGCCAT 
CTTTCCCATT 
TGCCATGAGA 
AAGGGTTGAA 
AATTAATATT 
TCTGGAATTG 
TCTTTAAATA 
ATAGCACTCC 
ACTTTTGTTT 
TTGAAACCTA 
AAAAAGGAGC 
ATAGATAGTT 
GAATATAAAA 
AGAATAGAAG 
AAAAATAAGA 
CACAAAACAA 
ATGCCATCTA 
TTAGAGTATT 
AGAAACTGGT 
ATAACTGCTT 
TGCCAATATG 
AAAGTTTCAT 
ATTGTCTTAT 
TTGTAGTTAT 
GGGATGTGGT 
GTCTGAATTA 



TCCACCAGGA 
ATTTAGACCT 
ACCCCCAACC 
GTCAGGCTCT 
GGCTTTAAAA 
CTGTGCATTA 
AAAAGGTTTG 
CAAAAATGTC 
TTTGAAATGT 
ATGATTTAGA 
ATTTCCATTA 
CGTTTTTTTA 
ACTAAACTAT 
ATTAATATTT 
GGAACATGTA 
GCTCCTCCGT 
CCCACTTGTA 
TTTCTGAGAG 
TACCATTCCT 
AAAAGTATAG 
CCTTGCCACT 
AAGAATGATT 
ATCAGTTCTG 
AATTAATATT 
CACCATACTT 
CTCTACAACG 
TTTTAAGGAG 
GCTGCTAGGC 
GTGTATGTCT 
AAAAGCTTCA 
TAGAAAGATA 
ATTCTTCAGG 
AAATTAAGAG 
TTAAGTCACA 
TGCTGAAGTT 
GGAAGGTAAG 
TTTCCCCTCT 
AAGCTGTAAA 
CCTCACATCC 
CAGATGGCAT 
CTATTTTGGA 
TTATTTATGA 
ATCGCCAATG 
ATATTCTGTG 
AAGTTAAGTT 



AGACGGGACC 
TTAGGTTCAC 
CATGGTCCCC 
AGAGATGAGC 
CAGAGCCCAT_ 
TCCATTACAG 
TTGTTAGAAC 
ATTTCTTCCC 
GCAATAAAGA 
ATGTCATGAA 
TCCTATTTTT 
AACTATCTGG 



CCCAAGTGTC 
AACTCAGGCT 
CACCTGTGAA 
TAGCATTCAC 
TGAGTTTACA 
GTAGAAAAAG 
AAATCTTTAA 
GATTTTTGAG 
TAGGACTGTG 
TTTTAGGGGG 
TAAACGTTGG 
AAAGTCTTAT 
TTTCTAAGAA 
TTTCAGATCC 
TATATTCTTC 
TGTCACTGTT 
ATGTGAAACA 
AGGACCTTTG 
AAAAGAGAAT 
GAAAACTCCA 
AATACAACTT 
AATATAATTT 
TAGGAAAGGT 
AAAGCCTTTT 
GATTCCAGTG 
ATCTGATTGC 
AGGGAGTATC 
AGTCATCTCC 
AGCAGCAATA 
GCTGATTTTT 
TCAACTTCAA 
AATCAC 



TGCCTCTCCA 
TTGGCCCAAG 
AGGAATACCC 
CTCCACCTGC 
AAAACTATGA 
TAAAGGATTT 
TAAGCTGCCT 
TAAATAAAAA 
ATACCTGTGT 
AAATATGAAC 
AGTGTACACC 
TCACAAAGAC 
GCTGAAGTTC 
TGTTGACTCA 
CCCAGAACTG 
CTCTACAAGT 
ATGCTTTCTT 
TTAGTAGCAA 
GGTGCACAAC 
AAATTTTAAA 
GAATATAATA 
AGGGTTATAA 
AAATGGGGGG 
TGTTTTTATT 
CCATTACTAC 
CGAACTTCAG 
ACACTAAAAC 
CATTCTTTGA 
GTGATATTTA 
ATTTTCTCTC 
AAAGAAGACA 
TCAATCTATA 
CAGAAGAGCA 
TTGAATTTAC 
CTAATTTTAA 
AAATTAAATC 
TTGGTGATTA 
TAGCTTCTCT 
ACCATTTCTG 
ATCCCTCAGC 
AACTAATTGT 
TTCAGCCTGA 
TTCATTGGAA 
GATAATCACT 



CCCTCGGGGA 
AGAGTACTTT 
ACCACCACCT 
CTCTCAGAGC 
CCTCTGAGGT 
CATTGGCTTC 
TGGCAGTGTG 
TCACCTTTTA 
TTTAGCTAAT 
ATTTCCTGTG 
AGCTGAATAC 
TGTTACGCTA 
TTTGTAGTAA 
TTGGACTGTT 
AAGATGGTGG 
GATGTCTTTT 
TACGATCCTC 
GAGTTGTTTG 
AGAAAAATGA 
ATGTACAGTC 
AAAAGATTGG 
CATGCCCTAG 
GGCGACAGAT 
TAAAAATCAG 
ACTGTCTTTA 
ACATTTTAAT 
TAAAATCATA 
AGTCCTATGA 
ATCGATTAAG 
TTTATACTAA 
ACTCTGTCAA 
TGTCCTCCCG 
TAGGCCACTT 
CTGTCAATAT 
ATGTCATTTA 
TATTTTTAAA 
TTCTGTATCT 
GAGAAGTTGT 
CAGCAAACCC 
CAAATCACTT 
GTCTGGATTT 
AAGCATTTCT 
AGTAAATTTA 
CATTTTCTCG 



3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 






SEQ ID NO:50 PAB7 Protein sequence
Protein Accession #: BAA13448



1 
I 

AFLSKVEEDD 
IEESKQETSM 
KIQTPELGEV 
AAAEPEDDSF 
MSSKLKSAQQ 
AAVLDDIQDL 
NVOVPEBPTH 
EPASVTPLEN 
IFLWRTVLW 
NMILSDEAIK 
KDV1SMNASE 
AELSEQIKSF 
ANGEVGGDRN 
LEDDRNSLQA 
ADEKAVSAAE 
EKREAANLRH 
GECSPPLTVE 



FGPRPLPPPF 
IPGTRLPPPT 



11 
I 

YPSEELLEDE 
ILDSEKTSET 
FQNKDSDYLK 
HWTPHTSVEP 
ESLPYNMEKV 
IYFVRYKHST 
LDQRVIGDTH 
AILLIYSFMF 
KDRVYQVTEQ 
YKDKIKTLEK 
FSEVQIALNE 
EKSQKDLEVA 
EKMKNQIKOM 
AKAGLEDECK 
EVKTYKRRIE 
KLLELTQKMA 
PPVRPLSATL 
SPTRVLDEGK 
GPGMRPPLGL 
HGPQEYPPPP 



21 
I 

NAINAKRSKE 
AAKGVNTGGR 
NDNPEEHLKT 



LDKVFRASES 
AEETATLVMA 
ASEVSQKPNT 
YLTKSLVATL 
QISEKLKTIM 
NQEILDDTAK 
AKLSEEKVKS 
LTHKDDNIKA 
MDVSRTQTAI 
TLRQKVEILN 
EMEDELQKTE 
MLQEEPVIVK 
NRRDMPRSEF 
VNMAPKGPPP 
REFAPGVPPG 
AVRDLLPSGS 



31 
I 

KNPGNQGRQF 
EPNTMVEKER 
SGLAGEPEGE 
IISSFFKEQQ 
QILSIAEKML 
PPLEEGLGGA 
EKDLDPGPVT 
PDDVQPGPDF 
KENTELVQKL 
NLRVMLESER 
ECHRVQEENA 
LTNCITQLNL 
SWEEDLKLL 
ELYQQKEMAL 
RSFKNQIATH 
PMPGKPNTQN 
GSVDGPLPHP 
FPGVPLMSTP 
RRDLPLHPRG 
RDEPPPASQS 



41 

I 

DVNLQVPDRA 
PLADKKAQRP 



SLQRFQKYFN 
DTRVAENRDL 
MEEMQPLHED 
TEDTPMDAID 
YGLPWKPVFI 
SNYEQKIKES 
EQNVKNQDLI 
RLKKKKEQLQ 
LECESESEGO 
QLKLRASVST 
QKKLSQEEYE 
EKKAHENWLK 
PPRRGPLSQN 
RWSAEASGKP 
MGGPVPPPIR 
FLPGHAPFRP 
TSCDCSQALK 



51 

I 

VLGTIHPDPE 
FERSDFSDSI 
KYMGTESQGS 
VHELEALLQE 
GMNENNIFEE 
NFSREKTAEL 
ANKQPETAAE 
TAFLGIASFA 
KKHVQETRKQ 
SENKKSIEKL 
QEIEDW5KLH 
NKGGNDSDEL 
KCNLEDQVKK 
RQEREHRLSA 
ARAAERAIAE 
GSFGPSPVSG 
SPSDPGSGTA 
YGPPPQLCGP 
LGSLGPREYF 
QSP 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

AGACTGAGGC GGAGGCAGCC CCGCGCCGCG CCGGACCCGA GCATATTTCA TTTTCTGTCA 




TTGGACTTTG 
CTTGGGGTTT 
TAAAAGATGG 
TTGATGGAAT 
GTACAGGCTC 
TTCCTGTTCA 
CTGTGTCCAA 
GTTCTGTGTC 
CCCATGCGAC 
TGTTCGCTGC 
CACTGAGCGC 
GTTCCGAGAC 
AACAGCAAAA 
TACCCACTCA 
CAAGAACTGG 
AACATTTGAA 
CTCCGCAGTT 
CAACCTCTGG 
GATCCACTGG 
CTGGAAGAAT 
TGGGACAAAC 
CAGGGAAACG 
TGGCACTGGG 
TGGCCTACAT 
AATTCTTTGC 
CGTTGAAACA 
GGAACAATGT 
TCTTTGGTAC 
AAGCTCTGGG 
TGGAAGGTCA 
CTGTGAATTT 
AAATTAAAAT 
AGTGGCCCTG
CATAAAGTAA 
AAATAAGCTT 
AGTGAAGAAT 
TGTTAGGTAG 
AACAGAATTA 
TTAAACAGAG 
GCGCGGTGGC 
GAGGTCAGGA 
ACAAAAATTA 
GCACGAGAAT 
CACTCCAGCC 
TATTTTTGCC 
TGTGTCATGC 
GCTACCATAT 
TTCATTGTGT 
GTAAAGATTT 
AATAGAGGGC 
GGGCGGATCA 
CTACTAAAAA 
GGGAGGCTGA 
CACACCACTG 



AGCCATTAGA 
CCGGCTCCAG 
CGGCAAGGCA 
AAATGCACAA 
TTTGAATATG 
AAAGGGAGAA 
AGTCACTTCC 
TTCACCAAAA 
CACCTCATCA 
ATCTGGACTG 
TGGTAAAACT 
TTCTCAGGAG 
TGGCCCACCA 
CAGTGATGCC 
AACAACTCAG 
AGAATCTGAA 
GGCTTCCTTG 
CAGACCAGGG 
CGTCATCAAG 
CTCAAACAGC 
CCAGCCAAGT 
AACTCCGATG 
GAAATCTTGG 
TGGATTTGTA 
CCCTGAATGT 
AACTTGGCAT 
TTTTCACTTG 
TATATGCCAT 
CTACACCTGG 
GACCTTTTTC 
TTGAAAGTCA 
TACTAATTAA 
AAGGAATAAA 
AGAGACGGTT 
TATAAAAACC 
TTAATTTTAG 
TTATGAGTAA 
TTGTATTTAA 
AATTTTATCA 
TCACGCCTGT 
GTTTGAGATC 
GCCGGACGCA 
CACTTGAACC 
TGGGTGACAG 
TTACAGTGGA 
CAGTAAGAGA 
AGCTTATAAG 
ATGCTTCATC 
AATTAAATAA 
CAGGTGTGGT 
TGAGGTCAAG 
TACAAAAATG 
GGCAGGAAAA 
CACTCCAGCC 



ACCATGAGCA 
GGCGGTAAGG 
GCCCAGGCAA 
GGAATGACTC 
ACTCTGCAAA 
CCTAAAGAAG 
ACAAACAACA 
GTCACATCCA 
CATGCTTCCC 
CATGCTAATG 
GCAGTTAATG 
CTAGCAGAGG 
AGAAAACACA 
AGCAAGAAGA 
TCTCGCTCTT 
GCCGATAATA 
GTAGCTTCCA 
GTTACCAGCC 
TCACCAAGCT 
GCTACTTACT 
GACCAGGACA 
TGCGCCCATT 
CACCCAGAAG 
GAGGAGAAAG 
GGTCGATGCC 
GTTTCCTGTT 
GAGGATGGTG 
GGATGTGAAT 
CATGACACTT 
TCCAAGAAGG 
ACAGTTCAGG 
TTTTTAGATT 
TTCCAGCTTT 
TGGCATTTAT 
AATTTCCTGA 
AATAAATAAT 
ATCTGCAAAA 
AAAAAAACTA 
GTAATAGGTG 
AATCCCAGCA 
AGCCTGGCCA 
GTGGCACGCG 
CGGGAGGGAG 
AGTGAGACTC 
TCATTCTAGT 
TGTTATATTC 
TCTCAAATTT 
ACCTATATTA 
TTTTGGCCTC 
GGCTCACGCC 
AGATCAAGAT 
AGCTGGGCAT 
TTCTTGAACC 
TGGTGACAGA 



ACTACAGTGT 
ATTTCAACAT 
ATGTAAGAAT 
ATCTTGAAGC 
GAGCATCTGC 
TAGTTAAACC 
TGGCCTACAA 
TCCCATCACC 
CTTCACCCGT 
CCAATCTTAG 
TCCCACGGCA 
GACAGAGAAG 
TTGTGGAGCG 
GACTGATTGA 
TCCGAATCCT 
CAAAGAAGGC 
CACGGAGCAT 
TCACAACTGC 
GGCAACGGCC 
CAGGATCAGT 
CTTTAGTGCA 
GTAACCAGGT 
AATTCAACTG 
GAGCCCTGTA 
AAAGGAAGAT 
TTGTGTGTGT 
AACCCTACTG 
TTCCCATAGA 
GCTTTGTATG 
ACAAGCCCCT 
AGAAGAGAAG 
CAATATTTAT 
AAAAACCAAG 
TATTACTTTT 
TGGACTATTA 
CCAATCTGAA 
GGCAATGAAA 
ATACTTATCT 
TCAGTTTTTA 
CTTTGGGAGG 
ACATGGTGAA 
CCTGTAATCC 
AGGTTGCAGT 
CGTCTCCAAA 
AGGAAAGGAC 
TTTTCTTATT 
TTGCCTTTTA 
GGCAAATTCC 
TCATAGTTTT 
TGTGATCCCA 
CATCCTGGCC 
GGTGGGGCGT 
CAGGAGACGG 
GCAAGACTCC 



GTCACTGGTT 
GCCTCTGACA 
AGGCGATGTG 
CCAGAATAAG 
TGCACCCAAG 
TGTGCCCATT 
TAAGGCACCA 
ATCGTCTGCC 
GGCTGCCGTC 
TGCTGACCAG 
GCCCACAGTC 
AGGATCCCAG 
CTATACAGAG 
GGATACTGAA 
TGCCCAGATC 
AAATAACTCT 
GCCCGAGAGC 
AGCTGCCTTC 
AAACCAAGGA 
GGCACCAGCC 
AAGAGCTGAG 
CATCAGAGGA 
CGCTCACTGC 
TTGTGAGCTG 
CCTTGGAGAA 
AGCCTGTGGA 
TGAGACTGAT 
AGCTGGTGAC 
CTCAGTGTGT 
GTGTAAGAAA 
GAATTTGAAG 
ATGGAGTTTT 
TCTGAGGAAA 
TCCTGTATTT 
AATTCATCTT 
ATAATTATAC 
ATGCCTTAAA 
TTAAAATAGT 
AAAAATTGCT 
CCAAGGTGGG 
ACCCCATCTC 
CAGCTACTCA 
GAGCCAAGAT 
AAAAAACTTT 
AATAAGATTT 
TCTTCCCCAC 
CTAAAATGTG 
ATTTTTTCCC 
CTCTCTCTTT 
GCACTTTGGG 
AACATGGTGA 
GCCTGTAGTC 
AAGTTGCAGT 
GGCTCTT 



GGCCCAGCTC 
ATCTCTAGTC 
GTTCTCAGCA 
ATTAAGGGTT 
CCTGAGCCGG 
ACATCTCCTG 
CGGCCTTTTG 
TTCACCCCAG 
ACTCCTCCCC 
TCTCCATCTG 
ACCAGCGTGT 
GGTGACAGTA 
TTTTATCATG 
GACTGGCGTC 
ACTGGGACTG 
CAGGAGCCTT 
CTGGACAGCC 
AAGCCTGTAG 
GTACCTTCCA 
AACTCAGCTT 
CACATTCCAG 
CCATTCTTAG 
AAAAATACAA 
TGCTATGAGA 
GTCATCAATG 
AAGCCCATTC 
TATTATGCCC 
ATGTTCCTGG 
TGTGAAAGTT 
CATGCTCATT 
AGAAAAAGGA 
GAAAAATAAT 
TATTTGGCTT 
TATGCCCATA 
AGAATAAATT 
CTTCTTTCCT 
TTTTATCAAT 
AAATAGGATT 
TGTAGGCTGA 
TGGACCACAT 
TACTAAAAAT 
AGAGGCTGAG 
CGTACCACTG 
GCTTGTATAT 
TTTATCAAAA 
CCAAAAATAA 
ATTGTTTCTG 
TTGCGCTAAG 
AAAGAGAATA 
AGGCCAAGAC 
AACCCTGTCT 
CCATGTACTT 
GAGCTGAGAT 



120 
180 
240 
300 
360 
420 
480 
540
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 



SEQ ID NO:52 PAB9 Protein sequence
Protein Accession #: NP_006448



1 MSNYSVSLVG 
61 MTHLEAQNKI 
121 NNMAYNKAPR 
181 ANANLSADQS 
241 KHIVERYTEP 
301 DNTKKANNSQ 
361 PSWQRPNQGV 
421 AHCNQVIRGP 
481 RCQRKILGEV 
541 CEFPIEAGDM 



11 
I 

PAPWGFRLQG 
KGCTGSLNMT 
PFGSVSSPKV 
PSALSAGKTA 
YHVPTHSDAS 
EPSPQLASLV 
PSTGRISNSA 
FLVALGKSWH 
INALKQTWHV 
FLEALGYTWH 



21 
I 

GKDFNMPLTI 
LQRASAAPKP 
TSIPSPSSAF 
VKVPRQPTVT 
KKRLIEDTED 
ASTRSMPESL 
TYSGSVAPAN 
PEEFNCAHCK 
SCFVCVACGK 
DTCFVCSVCC 



31 
I 

SSLKDGGKAA 
EPVPVQKGEP 
TPAHATTSSH 
SVCSETSOEL 
WRPRTGTTQS 
DSPTSGRPGV 
SAW^JTQPSD 
NTMAYIGFVE 
PIRNNVFHLE 
ESLEGQTFFS 



41 

I

QANVRIGDW 
KEWKPVPIT 
ASPSPVAAVT 
AEGQRRGSQG 
RSFRILAQIT 
TSLTTAAAFK 
QDTLVQRAEH 
EKGALYCELC 
DGEPYCETDY 
KKDKPLCKKH 



51 
I 

LSIDGINAQG 
SPAVSKVTST 
PPLFAASGLH 
DSKQQNGPPR 
GTEHLKESEA 
PVGSTGV1KS 
IPAGKRTPMC 
YEKFFAPECG 
YALFGTICHG 
AHSVNF 



60 
120 
180 
240 
300 
360 
420 
480 
540 



SEQ ID NO:53 PBH7 DNA SEQUENCE

Nucleic Acid Accession #: AA431407

Coding sequence: 1-884 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGAGCA CTGCTATACT 60

GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GGGCCTTCTG 120

CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGCCACCTA CTGGGGAATG 180

AAGATCAAGC CGGGTTTCAT GGGGAAGGCC ACTCCACCCT ATGACGTCCA GTTTCATATG 240

GAGGCCTCAG TTGAAAACTG CATTATTGTG AGCATGAACA CCGCTGACCC TGGCAGCCAG 300 

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AGGGCAGCAT CCTGCCACCT 360 

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TGCTATGAGG GTGACCCAGA GAAGACAGCT AAAGTGGAAT GTGGGGACTT CTACAACACT 480 

GGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATTTGTT TCCTGGGGAG GAGTGATGAC 540 

ATCATTAATG CCTCTGGGTA TCGCATCGGG CCTGCAGACG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGTC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTG 660

GTGAAGGCCT TTATTGTCCT GACCCCACAG TTCCTGTCCC ATGACAAGGA TCAGCTGACC 720 

AAGGAACTGC AGCAGCATGT CAAGTCAGTG ACAGCCCCAT ACAAGTACCC AAGGAAGGTG 780 

GAGTTTGTCT CAGAGCTGCC AAAAACCATC ACTGGCAAGA TTGAACGGAA GGAACTTCGG 840 

AAAAAGGAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAG AACGCACTCC ACACCTGAGG 900 

CAAATCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTGGG GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTTC TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080 
TCATAACGCA AGTAATAAAA TACTGATATC AACAA 

SEQ ID NO:54 PBH7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I 

MANCKMTKSI RFPALEHCYT GGEVVLPKDQ EEWKRRTGLL LYENYGQSET GLICATYWGM 60

KIKPGFMGKA TPPYDVQFHM EASVENCIIV SMNTADPGSQ GITHSLLLQV IDDKGSILPP 120

NTEGNIGIRI KPVRPVSLFM CYEGDPEKTA KVECGDFYNT GDRGKMDEEG YICFLGRSDD 180

IINASGYRIG PAEVESALVE HPAVAESAVV GSPDPIRGEV VKAFIVLTPQ FLSHDKDQLT 240
KELQQHVKSV TAPYKYPRKV EFVSELPKTI TGKIERKELR KKETGQM

SEQ ID NO:55 PBJ5 DNA SEQUENCE 

Nucleic Acid Accession #: AF388200

Coding sequence: 33-137 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60 

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120 

TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180

ATTTGCAACA TGAAGAAAGC TTATCTGGAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240 

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300 

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360 

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420 

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATGTCAT GTCTTTTAAA 480 
TAAGCCATAC TTTATGTTCA ATAAAAAGAG AATAAGCAGG A 

SEQ ID NO:56 PBJ5 Protein sequence
Protein Accession #: AAK83352

1 11 21 31 41 51 

I I I I I I 

MCCEIYYRLL VLKMEKKSEE LRNMDGLGNV EKGH 
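
The relationship between a "Coding sequence" coordinate range and the protein record that follows it can be checked mechanically. The sketch below is illustrative only and is not part of the original listing: it assumes 1-based inclusive coordinates and the standard genetic code, and the helper names `clean` and `translate_cds` are hypothetical. It takes the first 180 nucleotides of SEQ ID NO:55 as printed above, extracts positions 33-137, and translates them for comparison with SEQ ID NO:56.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(listing_text):
    """Keep only nucleotide letters; drops spaces, newlines and position numbers."""
    return "".join(ch for ch in listing_text.upper() if ch in "ACGTN")


def translate_cds(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive), stopping at the first stop codon."""
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for codons containing N
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First three lines of SEQ ID NO:55 (PBJ5), copied from the listing.
pbj5 = clean("""
GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60
TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120
TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180
""")

print(translate_cds(pbj5, 33, 137))
# prints MCCEIYYRLLVLKMEKKSEELRNMDGLGNVEKGH
```

Under these assumptions the translated product agrees residue for residue with the 34 amino acids listed for SEQ ID NO:56, and positions 33-35 and 135-137 are the ATG and TGA codons referred to in the "Coding sequence" line.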

SEQ ID NO:57 PBJ7 DNA SEQUENCE

Nucleic Acid Accession #: AA876910 

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60 

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AAACTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGCC CCTAGAAAAG 180 

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240 

GTTTCTTTTC CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300 

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AATTTGCCGG TCATAGGAGC AGGAAGTGTC 480 

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600 

AGCTGTAGAG ATACTTACCA GTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660 

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTCC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780 

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840 

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGG 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

GTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020 

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080 

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTTATT ATGTAGGATT AGGAGTAGAA 1140 

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTGCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACCGGGT ATAACTTATC TGCTTCTCCT 1260 

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA ATACCTGGTT GGCCTGCACC TCAGGTCTCA CTCGCTGCAT TAATGGAACT 1380 
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GAACCAGGAC CTCTCCTGTG CGTGTTAGTT CATGTACTTC CCCAGGTATA TGTGTACAGT 1440 

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500 

GTCCCACTTC TGGTTCCCCT ATTGGCTCGT CTTAGCATAG CTGGATCAGC AGCCATTGGT 1560 

ACGGCTGCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGGATGCT 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACATT CCCAGGTAGA GTCTCTGGCT 1680 

GAAGTAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740 

TGTGCAGCTC TAGGAGAAAG TTGTTGCTTC TATGCCAATC AATCTGGAGT CATAAAAGGT 1800

ACAGTAAAAA AAGTTCGAGA AAATCTAGAT AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGGCTAA CTACTTTAAT CACTGGGTTA 1920 

GCTGGACCPC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATTAAATTCG 1980

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040 
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I

MDSCLQHMRD LLYLLQELRC LNPATLLPDP DSTTPVHDCQ DLLETTKTGQ PDLQDVPLEK 60 

ADATVFTDGS SFLEQGERKA VSFPQPDLPD NPTYSTEEEK LASDVGANKN QEGRVFANTT 120 

WRAGTSKEVS FAVDLCVLFP EPARTHEEQH NLPVIGAGSV DLAAGFGHSG SQTGCGSSKG 180

AEKGLQNVDF YLCPGNHPDA SCRDTYQFFC PDWTCVTLAT YSGGSTRSST LSISRVPHPK 240

LCTRKNCNPL TITVHDPNAA QWYYGMSWGL RLYIPGFDVG TMFTIQKKIL VSWSSPKPIG 300

PLTDLGDPIF QKHPDKVDLT VPLPFLVPRP QLQQQHLQPS LMSILGGVHH LLNLTQPKLA 360

QDCWLCLKAK PPYYVGLGVE ATLKRGPLSC HTRPRALTIG DVSGNASCLI STGYNLSASP 420

FQATCNQSLL TSISTSVSYQ APNNTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVYVYS 480

GPEGRQLIAP PELHPRLHQA VPLLVPLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540

DFSNLQSAID ILHSQVESLA EVVLQNCRCL DLLFLSQGGL CAALGESCCF YANQSGVIKG 600

TVKKVRENLD RHQQERENNI PWYQSMFNWN PWLTTLITGL AGPLLILLLS LIFGPCILNS 660
FLNFIKQRIA SVKLTYLKTQ YDTLVNN 

SEQ ID NO:59 PCQ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_019005

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

TGGTGAATGA ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180 

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300 

TGGATCTTTA CGTTTATCTG AAGACTCTGC AGCTACATTA CTGTCAATAA ATTCAGATAC 360 

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420 

TGGACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCATA ACTCAAAGTT 480 

CAAAGATTTG ATAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540 

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GATAAGCACA GAGCTGACTT 600 

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATATAGTTC CCATGGAAAA 660 

AGTGAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720 

GTTAGGACAG AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCACGAGACC AGAAACTTCT 780 

CCTTGCTGGT ATGCATCGTA ACCTAGCTAT ATTTGATCTT CGGAATACAA GCCAAAAGAT 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACCCTT AACAAAAGTA GCATGGTGTC CCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT CCCATTGGGG ATGAAACTGA ACCCACAATA ATTGAAAGAA GTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT CCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TGTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGCCTG 1260 

GAGCCCAATT ACATCTTTAA TGTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAGA 1440 

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGATC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCCCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800 

CCTGAATGAA GGGGCATCTT CTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860 

TGGCTTTATC GGGTTATACG GATGAGAAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920 

TGCGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTGC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGG AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA GAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG GTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC CCAGTTCCAA GCCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT CCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGTGAGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT GTCCTGGCTG TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580 
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GGAACCAAAT CAGATGAAAA AGTGGACTTG AGCAAGGACA AAAAATTAGC CCAATTTAAC 2640 

AACTGGTTTA CATGGTGTCA TAATTGCAGG CACGGTGGAC ATGCTGGACA TATGCTTAGT 2700 

TGGTTCAGGG ACCATGCAGA GTGCCCTGTG TCTGCATGCA CGTGTAAATG TATGCAGTTG 2760 

GATACAACGG GGAATCTGGT ACCTGCAGAG ACTGTCCAGC CATAAAATGT TACCAOCTTA 2820

AGAGAACCCT TCAAGTGTGG AGCTTTCTAG TAGGTGTCCT TCATAGCTCA GAAACATACC 2880

TCAGAACAAG CCATTCATGA CTTACCTGTA ATGGGAAAAT AAATCATTCT ATCAGAAAAA 2940 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCQ1 Protein sequence
Protein Accession #: NP_061878

1 11 21 31 41 51

I I I I I I 

MSGTKPDILW APHHVDRFVV CDSELSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60

PYMKCVAWYL NYDPECLLAV GQANGRVVLT SLGQDHNSKF KDLIGKEFVP KHARQCNTLA 120

WNPLDSNWLA AGLDKHRADF SVLIWDICSK YTPDIVPMEK VKLSAGETET TLLVTKPLYE 180

LGQNDACLSL CWLPRDQKLL LAGMHRNLAI FDLRNTSQKM FVNTKAVQGV TVDPYFHDRV 240

ASFYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GLIATLTRDS NIIRLYDMQH 300

TPTPIGDETE PTIIERSVQP CDNYIASFAW HPTSQNRMIV VTPNRTMSDF TVFERISLAW 360

SPITSLMWAC GRHLYECTEE ENDNSLEKDI ATKMRLRALS RYGLDTEQVW RNHILAGNED 420

PQLKSLWYTL HFMKQYTEDM DQKSPGNKGS LVYAGIKSIV KSSLGMVESS RHNWSGLDKQ 480

SDIQNLNEER ILALQLCGWI KKGTDVDVGP FLNSLVQEGE WERAAAVALF NLDIRRAIQI 540
LNEGASSEKG RRSESQCGSN GFIGLYG 

SEQ ID NO:61 PDG3 DNA SEQUENCE

Nucleic Acid Accession #: U42359

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TTGTACATCT TAACAACCTT AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60 

GATCAGCCCA CAGTACACAT CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120 

GAGTCCTGGC TTTGTAAAAT GACTTATAAA GGTCCAAGGA TTTAGAGATG ATTAAGAGAT 180 

AAGCTGGCAT TCTGTAAAGG CACCATCGTC TATCCCCTGT CTTATCTAGA TAAAGAATGT 240 

AGTGCTAAAT CTTGTAATAA TATTGTACAA ATGGAAATTC AATCTTAAGG ATTATTTTTT 300

CCATATTGTT GTATTTCATT GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

CTGAGCCTCA GTTTTCCTCA TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCATAAGAC 480

AAGTTGTAGT AAATTACTGT TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540 

TTCCAGTCTT ACATTATTAT GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600

TACCAAAAGA CTGACACGTG GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCCCAA 660 

CAAATATACT TTCTTTAACT TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

AAATATTTTT TAATTCTATC TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780 
AATAATGTAA GCCTTAATAT TCAAGGGGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840 
CTTACTTGAA AACTTT 






SEQ ID NO:62 PDG3 Protein sequence
Protein Accession #: AAB18375

1 11 21 31 41 51 

I I I I I I 

KGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 

SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 

AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300 
AATSKGDVGK RRIICLVGLG LWFFFSPLL SIFRSKYHGY PYSDLDFE 



SEQ ID NO:63 PDG8 DNA SEQUENCE

Nucleic Acid Accession #: AL080235

Coding sequence: 245-453 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGTCGCCGCA CCGGCCGCCT CCGGCCCGCC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60

GGGGCGCCCA CCGCGCTGCC AGCCTACCCC GCGGCCGAGC CGCCCGGGCC GCTGTGGCTG 120 

CAGGGCGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGGTGGCCTG CTTCATGACC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTGCCCAT CATCGCCGGC 300 

TTCCTGCCCA ACGGCATGGA ACAGCGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360

GCCGCAGTGC CCGCAGGGAC CACCGCAGCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420 

GCGGCCGTCA CTTCGGGGGT GGCGACCAAG TGACCCGCTC CGCTCCTCCC TGTCTCCGTC 480 

CTGTGTCCGC GCGCGCGGGT GCCTTTCCCG CCGGGGACTC GGCCGGTGTG CTTCGTGCTG 540

TAGTTATCGT TAGTTCCTCT TCCCGAGATG GGGCCGCCGA GAGACCCCAG CGCCTTTGAA 600

AAGCAAGGTT TGTGCTGCGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTCCT TCCCCGCACC CACGTGCCCT AGATTCATGG CAGAAAATGA 780 

CCAAATCCTG TGTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG ATTAAATTGC TATTGCTGTA GTAAGAGAAG CTCTTTGTAT 900 

CTGAACATAG TTGTATTTGA AATTTGTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960

CATGGGAAGG ATTTAACACC GATATATTGT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020

TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGCGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDG8 Protein sequence
Protein Accession #: CAB45781

1 11 21 31 41 51

I I I I I I

GRRTGRLRPA AAPSAAAATA GAPTALPAYP AAEPPGPLWL QGEPLHFCCL DFSLEELQGE 60

PGWRLNRKPI ESTLVACFMT LVIVVWSVAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 120
AAVPAGTTAA AAAAAAAAAA AAVTSGVATK

SEQ ID NO:65 PDM1 DNA SEQUENCE

Nucleic Acid Accession #: NM_006765

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGGCCGCGGC CCGGGTCCCT CGCAAAGCCG CTGCCATCCC GGAGGGCCCA GCCAGCGGGC 60 

TCCCGGAGGC TGGCCGGGCA GGCGTGGTGC GCGGTAGGAG CTGGGCGCGC ACGGCTACCG 120 

CGCGTGGAGG AGACACTGCC CTGCCGCGAT GGGGGCCCGG GGCGCTCCTT CACGCCGTAG 180

GCAAGCGGGG CGGCGGCTGC GGTACCTGCC CACCGGGAGC TTTCCCTTCC TTCTCCTGCT 240 

GCTGCTGCTC TGCATCCAGC TCGGGGGAGG ACAGAAGAAA AAGGAGAATC TTTTAGCTGA 300

AAAAGTAGAG CAGCTGATGG AATGGAGTTC CAGACGCTCA ATCTTCCGAA TGAATGGTGA 360 

TAAATTCCGA AAATTTATAA AGGCACCACC TCGAAACTAT TCCATGATTG TTATGTTCAC 420 

TGCTCTTCAG CCTCAGCGGC AGTGTTCTGT GTGCAGGCAA GCTAATGAAG AATATCAAAT 480 

ACTGGCGAAC TCCTGGCGCT ATTCATCTGC TTTTTGTAAC AAGCTCTTCT TCAGTATGGT 540 

GGACTATGAT GAGGGGACAG ACGTTTTTCA GCAGCTCAAC ATGAACTCTG CTCCTACATT 600

CAYGCATTTW CCTCCAAAAG GCAGACCTAA GAGAGCTGAT ACTTTTGACC TCCAAAGAAT 660 

TGGATTTGCA GCTGAGCAAC TAGCAAAGTG GATTGCTGAC AGAACGGATG TTCATATTCG 720 

GGTTTTCAGA CCACCCAACT ACTCTGGTAC CATTGCTTTG GCCCTGTTAG TGTCGCTTGT 780 

TGGAGGTTTG CTTTATTNGA GAAGGAACAA CTTGGAGTTC ATCTATAACA AGACTGGTTG 840 

GGCCATGGTG TCTCTGTGTA TAGTCTTTGC TATGACTTCT GGCCAGATGT GGAACCATAT 900

CCGTGGACCT CCATATGCTC ATAAGAACCC ACACAATGGA CAAGTGAGCT ACATTCATGG 960 

GAGCAGCCAG GCTCAGTTTG TGGCAGAATC ACACATTATT CTGGTACTGA ATGCCGCTAT 1020 

CACCATGGGG ATGGTTCTTC TAAATGAAGC AGCAACTTCG AAAGGCGATG TTCGAAAAAG 1080 

ACGGATAATT TGCCTAGTGG GATTGGGCCT GGTGGTCTTC TTCTTCAGTT TTCTACTTTC 1140 

AATATTTCGT TCCAAGTACC ACGGCTATCC TTATAGTGAT CTGGACTTTG AGTGAGAAGA 1200

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GCTTTTTAAT TAAATGAAGC 1260 

CAAGTGGGAT TTGCATAAAG TGAATGTTTA CCATGAAGAT AAACTGTTCC TGACTTTATA 1320 

CTATTTTGAA TTCATTCATT TCATTGTGAT CAGCTAGCTT ATTCTTGTGT ACTTTTTTTA 1380 

AACTGTGGGT TTTCCTAGTA AATTTAATTT ACAGAAATCA ATGGTAGCAT TTAGTAATCT 1440 

ACAAAGGAAA TATCAAAGTG TTTTTCAAGC CTGTTATATY CAGTGTGTKC CACAGGATTG 1500
CAATAAATGA CAATGTAATT A 



SEQ ID NO:66 PDM1 Protein sequence:
Protein Accession #: NP_006756

1 11 21 31 41 51 

I I I I I I 

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 
SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120
AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180
WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240
AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE






SEQ ID NO:67 PDM2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000947

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCACCGTT TGTGTTTTCC 60 

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120

AGGTTGGCAG GTGACCAGAG GAATGCTTCC TACCCTCATT GCCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300 

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360 

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTCCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGCCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT CGAGAACAGG AGATTGTTGC CTCATCACCA 600 

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TCCATTTATA AGATCCCTTT TGCTGATGCT 660 

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720







AAGGACATTG TGGCAATCAT CCTGAATGAA TTTAGAGCCA AACTGTCCAA GGCTTTGGCA 780 

TTAACAGCCA GGTCCTTGCC TGCTGTGCAG TCTGATGAAA GACTTCAGCC TCTGCTCAAT 840 

CACCTCAGTC ATTCCTACAC TGGCCAAGAT TACAGTACCC AGGGAAATGT TGGGAAGATT 900 

TCTTTAGATC AGATTGATTT GCTTTCTACC AAATCCTTCC CACCTTGCAT GCGTCAGTTA 960 

CATAAAGCCT TGCGGGAAAA TCACCATCTT CGTCATGGAG GCCGAATGCA GTATGGCCTA 1020 

TTTCTGAAGG GCATTGGTTT AACTTTGGAA CAGGCATTGC AGTTCTGGAA GCAAGAATTT 1080 

ATCAAAGGAA AGATGGATCC AGACAAGTTT GATAAAGGTT ACTCTTACAA CATCCGTCAC 1140 

AGCTTTCGAA AGGAAGGCAA GAGGACAGAC TATACACCTT TCAGTTGCCT GAAGATTATT 1200 

CTGTCCAATC CACCAAGCCA AGGGGATTAT CATGGGTGCC CATTCCGTCA CAGTGATCCA 1260 

GAGCTGCTGA AGCAAAAGTT GCAGTCATAC AAGATCTCTC CTGGAGGGAT AAGCCAGATT 1320 

TTGGATTTAG TAAAGGGGAC ACATTACCAG GTAGCCTCTC AAAAATACTT TGAGATGATA 1380 

CACAATGTGG ATGATTGTGG CTTTTCTTTG AATCATCCTA ATCAGTTCTT TTGTGAGAGC 1440 

CAACGTATTC TAAATGGTGG TAAAGACATA AAGAAGGAAC CTATCCAACC AGAAACTCCT 1500 

CAACCCAAAC CAAGTGTCCA GAAAACCAAG GATGCATCAT CTGCTCTGGC CTCTTTAAAT 1560 

TCCTCTCTGG AAATGGATAT GGAAGGACTA GAAGATTACT TTAGTGAAGA TTCTTAGGCA 1620 

GTTTTATAAC CCTTTTTCCT CAATAGCCTG TTTCCTGTTT TTAAGATTTT GCCTTTGTTG 1680 

TTGAAAAAGG GTTTCACTGT CACCAAGGCT TAGTGCAGTG ACACAATTAC AGCTGATTGC 1740 

AGCCTTGACC TTCCCAGCTC AAGTGATCCT CCTACCTCAG CCTCCCAAGT AGTTAGGACA 1800 

CACAGGTGTG CACCTCATAT CCAGATAATT TTTTTCAATT TTTTTTTGTA GAGGTGGGGG 1860 

GTCTCCCTAT GTTGCCCAGG CAGATCTCAG ACTCCTGGGC TCAAGCGATC CTCACACCTC 1920 

AGCGTCCCAG AGTGCTGGGA TTACAGTTGT GAGCCACTGT GCCTGGCCTT TTTTTTTTTT 1980 

TAACCTTTTC GTTTAACTTC TCTCTTCACT GCATCCCAAT CCATCTACAG GCATGCACAC 2040

TTATTAGGAA AGGAGGTTTG AGGTAACAAC AGAGACTTTC ACTATATTTT GCTTTGACAG 2100 

AAGGAAAGAG GAGGAGTTTC TATTAAAATC TGTCACTTGA GTGATGTCAT TTAAGTCCTA 2160

TTTTAGGAGA TAAAAACAGC TTTGGGGACT GGTTAAAGTC CCCCAGAAAC TACAATAAAG 2220 

AACAACTTTT GTTTTAACTC TTAATCACTT TGTAATTTTG ACTCAATCCT TTTCTGGACC 2280 
ATTTTTGTTA ATAAATATCA AAGTGT 



SEQ ID NO:68 PDM2 Protein sequence:
Protein Accession #: NP_000938

1 11 21 31 41 51

I I I I I I

MEFSGRKRRK LRLAGDQRNA SYPHCLQFYL QPPSENISLT EFENLAIDRV KLLKSVENLG 60 

VSYVKGTEQY QSKLESELRK LKFSYREKLE DEYEPRRRDH ISHFILRLAY CQSEELRRWF 120 

IQQEMDLLRF RFSILPKDKI QDFLKDSQLQ FEAISDEEKT LREQEIVASS PSLSGLKLGF 180

ESIYKIPFAD ALDLFRGRKV YLEDGFAYVP LKDIVAIILN EFRAKLSKAL ALTARSLPAV 240

QSDERLQPLL NHLSHSYTGQ DYSTQGNVGK ISLDQIDLLS TKSFPPCMRQ LHKALRENHH 300

LRHGGRMQYG LFLKGIGLTL EQALQFWKQE FIKGKMDPDK FDKGYSYNIR HSFGKEGKRT 360 

DYTPFSCLKI ILSNPPSQGD YHGCPFRHSD PELLKQKLQS YKISPGGISQ ILDLVKGTHY 420

QVACQKYFEM IHNVDDCGFS LNHPNQFFCE SQRILNGGKD IKKEPIQPET PQPKPSVQKT 480
KDASSALASL NSSLEMDMEG LEDYFSEDS

SEQ ID NO:69 PDM3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024840

Coding sequence: 108-491 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

AATTCATACA GGAGAGAAGT CATATATATG CAGTGATTGT GGAAAAGGCT TCATCAAGAA 60 

GTCTCGGCTC ATTAATCATC AGAGAGTTCA TACAGGAGAG AAACCACATG GATGCAGCCT 120 

GTGTGGGAAG GCCTTCTCCA AAAGGTCCAG GCTCACTGAA CACCAGAGAA CTCATACAGG 180 

AGAGAAGCCC TATGAATGCA CTGAATGTGA CAAAGCATTC CGCTGGAAAT CACAGCTCAA 240 

TGCACATCAG AAAGCTCACA CAGGAGAGAA GTCATATATA TGCCGTGATT GTGGAAAAGG 300 

CTTCATTCAG AAGGGAAATC TCATTGTACA TCAGCGAATT CATACTGGAG AAAAACCCTA 360 

TATATGCAAT GAATGTGGAA AAGGCTTCAT CCAAAAGGGC AACCTCCTTA TTCATCGACG 420 

TACTCACACT GGAGAGAAAC CCTATGAATG CAATGAATGT GGGAAAGGCT TCAGCCAGAA 480 

GACATGTTTA ATATCCCATC AGAGATTTCA CACAGGAAAG ACACCCTTTG TATGTACTGA 540

GTGTGGAAAA TCCTGCTCAC ACAAGTCAGG TCTCATTAAC CACCAGAGAA TTCACACAGG 600 

AGAGAAACCC TATACATGCA GTGACTGTGG GAAAGCTTTC AGAGATAAAT CATGTCTCAA 660 

CAGACATCGG AGAACTCATA CAGGGGAGAG ACCGTATGGA TGCTCTGATT GTGGGAAAGC 720 

TTTCTCCCAC TTGTCATGCC TTGTTTATCA TAAGGGAATG CTGCATGCAA GAGAGAAATG 780 

TGTAGGTTCA GTCAAATTGG AAAATCCTTG CTCAGAGAGT CATAGCTTAT CACATACACG 840 

TGATCTCATA CAGGATAAAG ACTCTGTTAA CATGGTGACT CTGCAGATGC CTTCTGTGGC 900 

AGCTCAGACC TCATTAACTA ACAGTGCGTT CCAAGCAGAG AGCAAAGTAG CCATTGTGAG 960 

CCAGCCTGTT GCCAGAAGTT CAGTCTCAGC AGATAGTAGA ATTTGCACAG AATAAAAACC 1020 

ATATGAATGC AGTGAATGTG GTAGTGCTTT CAGTGATCAA TTACATCATA TGTCACAAAA 1080 

AACACAGAGG AACAAACTGA TATATTCAAG GTGGAAAGCC CTTGAATAAA ACCTTATGGC 1140 

TAATAAGCAT ATACTCAGAG AAAAATAGTA TGAAGTGGAG ACTGGGAAAT TCTTTTATGG 1200 

GAAGATAGAT CTTCTCATCA GTGACCATAG ATCACATCTT CAGTGAGCTT ATAGTTGGTA 1260 

GAAATATAAT GATCATGGAA AAGTCCTTGT TCAGAAACAG TACGCCAGTA GGTATCAGGG 1320 

GGTTTACACA GGAGAGAAAC TTTTGGAAGA CCTTTGAAGG CTATGAATGT GGCAGGGTTG 1380 

CTAGTGGTAC ATTCTGCCTT ATCCTCAGAG GGAATCATAT AGAAATAAAA CTATGAAAAT 1440 

GTAACTAGAA CATCTTCATC AAAATATGAA AGAACACACG AAGCAAATAA GCCCTGTGAA 1500 

AAGGAGTATT TTAGAGATTT CGATCAGAAA TCTAACATCA TTATATGGCA GATAATATAC 1560 

AGGATGTGTA TTTTAGGACA ATATACCTTG AATCACTAGT TGATATGTCA ATGACTAATT 1620 

AAAAGGGGTT GTCAGTGTTA CACATCATTG GTTAAATTTA TAGCACAATG TACCTCTTCC 1680 

CCCTTTTTTG ATAAGAGTCT TCTATTCCCA ACCAAGATCA TTATATGATT AGCTCTTGTG 1740 

TTTCTTTGAT TCCAAATTTC TTCACTTGTT ATTTCAGACT ACTGAAGCTC TTCAAAAGGA 1800

AAAATGTATT TAATTTAATA ATGTAACACA ACAAGTTTGG ATGTGTTTAA CTTTATAAAT 1860
AATCACCCCA GAGGAATGAA GTTCAAAACT TGTGAATAAC C 

SEQ ID NO:70 PDM3 Protein sequence
Protein Accession #: NP_079116

1 11 21 31 41 51 

MDAACVGRPS PKGPGSLNTR ELIQERSPMN ALNVTKHSAG NHSSMHIRKL TQERSHIYAV 60
IVEKASFRRE ISLYISEFIL EKNPIYAMNV EKASSKRATS LFIDVLTLER NPHNAHNVGK 120 
ASARRHV 

SEQ ID NO:71 PDM8 DNA SEQUENCE

Nucleic Acid Accession #: NM_018455

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

AATTTCGGCA CGGGGGGGAG GCACAGTGAG TCCACTGGGG CACGGCAGCG TCTAAGCCAC 60 

AAGCCGACTG ACATAAGCCA GGTCCTAACG GAGCCTATGT GTAAGTCCAC TACTGGTGCA 120 

AGGTTGCACA CTTCTAAGAA GAGCGGCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCCCCGTGCA GTCCCCTGTG CCCAAGACAC AGCCTGATGC TTGTGCTCCG GTGGGCGGAC 240 

TTGGAGGCGG CGGGAACTGC AATTGGTGGC TTTGAAGGGC GGCGAGCGGG AACAGCTCTT 300 

GAGGAGTGAG ACTGCAGGAG ATGTGGGCCG TGCCAAAGAG ATGGATGAGA CTGTTGCTGA 360 

GTTCATCAAG AGGACCATCT TGAAAATCCC CATGAATGAA CTGACAACAA TCCTGAAGGC 420 

CTGGGATTTT TTGTCTGAAA ATCAACTGCA GACTGTAAAT TTCCGACAGA GAAAGGAATC 480 

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GGAAAAGCGT GCAAGTATCA GTGATGCTGC 540 

CCTGTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600 

GATGAGTAAA GGACCAGGTG AAGATGTTGA CCTTTTTGAT ATGAAACAAT TTAAAAATTC 660 

GTTCAAGAAA ATTCTTCAGA GAGCATTAAA AAATGTGACA GTCAGCTTCA GAGAAACTGA 720 

GGAGAATGCA GTCTGGATTC GAATTGCCTG GGGAACACAG TACACAAAGC CAAACCAGTA 780 

CAAACCTACC TACGTGGTGT ACTACTCCCA GACTCCGTAC GCCTTCACGT CCTCCTCCAT 840 

GCTGAGGCGC AATACACCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900 

CCGACAAGAG GAGATCATTT TAGATATTAC CGAAATGAAG AAAGCTTGCA ATTAGTGAAC 960 
ATGAAAGGAA AATAAAAATT CCTCACAGTC AAAAAAAAAA AAAAA 

SEQ ID NO:72 PDM8 Protein sequence: 
Protein Accession #: NP_060925 



1 11 21 31 41 51 

| | | | | | 

MDETVAEFIK RTILKIPMNE LTTILKAWDF LSENQLQTVN FRQRKESVVQ HLIHLCEEKR 60 

ASISDAALLD IIYMQFHQHQ KVWDVFQMSK GPGEDVDLFD MKQFKNSFKK ILQRALKNVT 120 

VSFRETEENA VWIRIAWGTQ YTKPNQYKPT YVVYYSQTPY AFTSSSMLRR NTPLLGQELE 180 
ATGKIYLRQE EIILDITEMK KACN 
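
The relationship between a "Coding sequence" coordinate pair and the corresponding protein sequence can be sketched in Python as follows. This is a minimal illustration only: the function names `extract_cds` and `translate` are hypothetical, the coordinates are assumed to be 1-based and inclusive as in the listings above, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def extract_cds(dna, start, end):
    """Return the coding region for 1-based, inclusive coordinates.

    Spaces and the running position numbers of the listing layout are
    dropped before slicing, so rows can be pasted in as printed.
    """
    cleaned = "".join(ch for ch in dna.upper() if ch in "ACGTN")
    return cleaned[start - 1:end]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon.

    Codons containing an ambiguous base (for example N) become X.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

Applied to a nucleotide listing with its stated coordinates, the translated string can be compared block by block with the printed protein sequence to spot transcription errors.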

SEQ ID NO:73 PDM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016192 
Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTGC 60 

TGGCTGCTGC TGCTGCCCGT CATGCTACTC ATCGTAGCCC GCCCGGTGAA GCTCGCTGCT 120 

TTCCCTACCT CCTTAAGTGA CTGCCAAACG CCCACCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCTCTGTGAC ACCAACACCT GTAAATTTGA TGGGGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGG GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360 

TGCAAACAGC AGAGTGAGAT ACTTGTGGTG TCAGAAGGAT CATGTGCCAC AGATGCAGGA 420 

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATCC 480 

ACCTGTGATA TTTGCCAGTT TGGTGCAGAA TGTGACGAAG ATGCCGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC TCTGCGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAAGAA GCATCGTGTC AGAAACAGGA GAAAATTGAA 660 

GTCATGTCTT TGGGTCGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720 

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAGAA 780 

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TGCAGGAGCC ATCTTGCAGG TGTGATGCTG GTTATACTGG ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960 

TTAATCGCAG CTGTGATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTCCTCTGC 1020 

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA 
SEQ ID NO:74 PDM9 Protein sequence: 
Protein Accession #: NP_057276 

1 11 21 31 41 51 

| | | | | | 

1 MVLWESPRQC SSWTLCEGFC WLLLLPVMLL IVARPVKLAA FPTSLSDCQT PTGWNCSGYD 
61 DRENDLFLCD TNTCKFDGEC LRIGDTVTCV CQFKCNNDYV PVCGSNGESY QNECYLRQAA 
121 CKQQSEILVV SEGSCATDAG SGSGDGVHEG SGETSQKETS TCDICQFGAE CDEDAEDVWC 
181 VCNIDCSQTN FNPLCASDGK SYDNACQIKE ASCQKQEKIE VMSLGRCQDN TTTTTKSEDG 
241 HYARTDYAEN ANKLEESARE HHIPCPEHYN GFCMHGKCEH SINMQEPSCR CDAGYTGQHC 
301 EKKDYSVLYV VPGPVRFQYV LIAAVIGTIQ IAVICVVVLC ITRKCPRSNR IHRQKQNTGH 
361 YSSDNTTRAS TRLI 

SEQ ID NO:75 PD01 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014324 

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons) 



GGCGCCGGGA 
TTCCTTCAGC 
GTCCGGCCTG 
GGTACGCGTG 
CTCGCTAGTG 
GGTCGGATGT 
CAGAGATTCT 
AGTTCAGGAA 
TGTTCTCTCA 
TGACTTTGCT 
CACACGCACT 
AAGTTCTTTT 
CATGTTGGAT 

GTCTGATGAA 
TGCAGATGTA 
TGCCTGTGTG 
ACGGGGCTCG 
GTTAAACACC 
GGAGATACTT 
AATCATTGAA 
AATTTGAATA 
GAGGAACAGT 
CTACAGTGAT 
TGGGTACTTA 
TGATATTAAG 
TCTTGAAGAC 
AAATGCCACA 
GGCCTTTTGT 
TATCACACTT 
CTGAAAAAAA 
GGGACAGTCA 
CTCTGGGCTG 
TTCTGGATCT 
AAAAAAAAAA 



11 

I 

TTGGGAGGGC 
GGGGCACTGG 
GCCCCGGGCC 
GACCGGCCCG 
CTGGACCTGA 
GCTGCTGGAG 
GCAGCGGGAA 
AGCTTCTGCC 
AAAATTGGCA 
GGTGGTGGCC 
GACAAGGGTC 
CTGTGGAAAA 
GGTGGAGCAC 
GCAATAGAAC 
CTTCCCAATC 
TTTGCAAAGA 
ACTCCGGTTC 
TTTATCACCA 
CCAGCCATCC 
GAAGAATTTG 
AGTAATAAGG 
CTGCATTTAC 
ATTACAGTGT 
GATTGAATTC 
TACTAAATTA 
ATTCTTGACT 
ATCGATATAC 
AATTGTATGG 
CTTGGTGTTC 
TGTAATTTGC 
CATATCCAAA 
GTTTTAGGGT 
TCAGCTTTCC 
TATACCCAAC 
AAAAAAAAAA 



21 
I 

TTCTTGCAGG 
GAAGCGCCAT 
GTNTCTGTGC 
GCTCCCGCTA 
AGCAGCCGCG 
CCCTTCCGCC 
AATCCAAGGC 
GGTTAGCTGG 
GAAGTGGTGA 
TTATGTGTGC 
AGGTCATTGA 
CTCAGAAATC 
CTTTCTATAC 
CCCAGTTCTA 
AGATGAGCAC 
AGACGAAGGC 
TGACTTTTGA 
GTGAGGAGCA 
CTTCTTCCAA 
GATTCAGCCG 
TAAAAGCTAG 
AGTGTAGAGT 
CCTACCACTC 
TAAAAATGGT 
TGGTAGTTAT 
TATATTTTGA 
ATTTATTTAC 
TGATAAAAGT 
ATGATCTCCC 
AAAGAAAAGT 
ATAATGAGGA 
TGCCTGTATC 
TTTCTCCATG 
ACACAGCAAC 
AAAAAAAA 



31 
I 

CTGCTGGGCT 
GGCACTGCAG 
TATGGTCCTG 
CGACGTGAGC 
GGAGCCGCGT 
GCGGTGTCAT 
TTATTTATGC 
CCACGATATC 
GAATCCGTAT 
ACTGGGCATT 
TGCAAATATG 
GAGTCTGTGG 
GACTTACAGG 
CGAGCTGCTG 
GGATGATTGG 
AGAGTGGTGT 
GGAGGTTGTT 
GGACGTGAGC 
AGGGGATCCT 
AGAAGAGATT 
TCTCTAACTT 
AACACATAAC 
TAATCAAGAA 
TATCATTAGG 
TCTGCCTTCC 
ATGGGTTCTA 
ACTCTTGATT 
CACGTGAAAC 
TCTAAGCACA 
TTCACCTGTA 
AATGTGTTGG 
CAGTAACTCG 
TGTTTGATTT 
ATCCAGAAAT 



41 

I • 

GGGGCTAAGG 
GGCATCTCGG 
GCTGACTTCG 
CGCTTGGGCC 
GCTGCGGCGT 
GGAGAAACTC 
CAGGCTGAGT 
AACTATTTGG 
GCCCCGCTGA 
ATAATGGCTC 
GTGGAAGGAA 
GAAGCACCTC 
ACAGCAGATG 
ATCAAAGGAC 
CCAGAAATGA 
CAAATCTTTG 
CATCATGATC 
CCCCGCCTTG 
TTCATAGGAG 
TATCAGCTTA 
CCAGGCCCAC 
ATTGTATGCA 
AAGAATTACA 
GCTTTTGATT 
AGTTTGCTTG 
GTGAAAAAGG 
CTACAATGTA 
AGAGTGATTG 
TTCCAAACTT 
TTGAATCAGA 
CTCACTACGT 
GGGCCTGTTT 
CTCCTCAGGC 
AAAGATCTCA 




GGGGCAAGCG 
CTGTGCAAGC 
CAGCTGGGCC 
GGATTTGGCC 
CTTTGTCAGG 
ATCTCGTGGC 
TTTTTGACCG 
CAGCATATTT 
GAGGACAGAA 
GGGAATTCAT 
TTGGACTAAA 
AGAAGAAGTT 
ACGGCACAGA 
ACAACAAGGA 
CACCTCTGCT 
AACACACTGA 
ACTCAGATAA 
GGCTCAAGTG 
TGGAAACATG 
GACTCTGATT 
TATAAAACTT 
ATATATTTGT 
AATGATATAT 
GAAAATGAGG 
GTTGCATCCA 
TAGCAACAGT 
ATGCCTTCAA 
AGAGTCCAGA 
CCCCGTGGGT 
TGGTAGCAAG 
GGACCCCCCA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



SEQ ID NO:76 PD01 Protein sequence: 
Protein Accession #: NP_055139 

1 11 21 31 41 51 

| | | | | | 

1 MALQGISVVE LSGLAPGRXC AMVLADFGAR VVRVDRPGSR YDVSRLGRGK RSLVLDLKQP 
61 REPRAAASVQ AVGCAAGALP PRCHGETPAG PRDSAAGKSK AYLCQAEWIW PVQESFCRLA 
121 GHDINYLALS GVLSKIGRSG ENPYAPLNLV ADFAGGGLMC ALGIIKALFD RTRTDKGQVI 
181 DANKVEGTAY LSSFLWKTQK SSLWEAPRGQ NMLDGGAPFY TTYRTADGEF HAVGAIEPQF 
241 YELLIKGLGL KSDELPNQMS TDDWPEMKKK FADVFAKKTK AEWCQIFDGT DACVTPVLTP 
301 EEVVHHDHNK ERGSFITSEE QDVSPRLAPL LLNTPAIPSS KGDPFIGEHT EEILEEFGFS 
361 REEIYQLNSD KIXESNKVKA SL 

SEQ ID NO:77 PD03 DNA SEQUENCE 

Nucleic Acid Accession #: AB028951 

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GTTAAATCCT TACTTTACCA GATTCTTGAT GGTATCCATT ACCTCCATGC AAATTGGGTG 60 

CTTCACAGAG ACTTGAAACC AGCAAATATC CTAGTAATGG GAGAAGGTCC TGAGAGGGGG 120 

AGAGTCAAAA TAGCTGACAT GGGTTTTGCC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180 

GCAGATTTGG ATCCAGTAGT TGTGACATTT TGGTATCGGG CTCCAGAACT TTTGCTTGGT 240 

GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300 

TTGACTTCGG AACCTATTTT TCACTGTCGT CAGGAAGATA TAAAAACAAG CAATCCCTTT 360 


WO 02/30268 
CATCATGATC AACTGGATCG GATATTTAGT GTCATGGGGT TTCCTGCAGA TAAAGACTGG 420 
GAAGATATTA GAAAGATGCC AGAATATCCC ACACTTCAAA AAGACTTTAG AAGAACAACG 480 
TATGCCAACA GTAGCCTCAT AAAGTACATG GAGAAACACA AGGTCAAGCC TGACAGCAAA 540 
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA CCAAGAGAAT TACCTCGGAG 600 
CAAGCTCTGC AGGATCCCTA TTTTCAGGAG GACCCTTTGC CAACATTAGA TGTATTTGCC 660 
GGCTGCCAGA TTCCATACCC CAAACGAGAA TTCCTTAATG AAGATGATCC TGAAGAAAAA 720 
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTCCACAG 780 
CAGGCAGCAG CCCCTCCACA GGCGCCCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840 
ACCGCAGGTG CGGCTGGGGC CGGGGTCGGG GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900 

GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960 
AACTCAGGTG GACCTGTGAT GCCCTCGGAT TATCAGCACT CCAGTTCTCG CCTGAATTAC 1020 
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA CCCATCTCAC CAGGCCCACC GGTACTGACC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAGG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 

CAAAAAAATG CAAACTATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260 
TACTGAGCAT TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
AGTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTGTG ACTTCTCTGA TAAAGCGTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440 
TTTCTCATGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500 

AATGTTTTAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACTTCATGAA CTGATTTAGC 1560 
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACAGACAATA ACTGTTCTGC 1620 
TCATGGAAGT CTTAAACAGA AACTGTTACT GTCCCAAAGT ACTTTACTAT TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AACCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAAGA ACTGAAGTAT 1800 

GTGGTAATTT TTATAGAATC ATTCATATGG AACTGAGTTC CCAGCATCAT CTTATTCTGA 1860 
ATAGCATTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GTAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTCA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAGGAAAAG TTTTTAGTGT GTATTGTACA TAAAGTCGGC TTCTCTAAAG AACCATTGGT 2100 

TTCTTCACAT CTGGGTCTGC GTGAGTAACT TTCTTGCATA ATCAAGGTTA CTCAAGTAGA 2160 
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAGTGT TCTCCATTCG TATTTGTATT 2220 
AGATATAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATGTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340 
TTTTCATTGC ACTGAATGAT TTCTTTTGCC CCTAGGAGAA AACTTAATAA TTGTGCCTAA 2400 

AAACTATGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCATTA 2460 
TCTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAGCCCCTC ACTCCCCAGA 2520 
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTGTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCACCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAATAGCTGA TTATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTATT CTTTCTGAAC 2700 

TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760 
CCAGTATACA TTTTGCACTA TTGATGTGAT ACTGTAGCCA GCCAGGACCT TACTGATCTC 2820 
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880 
GAAGCAGGTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCTG TTTTACCCCT 2940 
TCCATTTTTT AAAATAAGAA ATTAGCAGCC CTCTGCATAA TGTAGCTGCC TATATGCAGT 3000 

TTTATCCTGT GCCCTAAAGC CTCACTGTCC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060 
CCCTCACCAT GTGCCTGGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGCCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT ATGAAGGTCT TTTTGTTTAC TTCTAAACCC 3240 
ACTTGGGTAG TTACTATCCC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300 

GTCAGTCTAC CTTAGAGAAA GCCAGTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360 
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TCCTAAGTGG TTGTAGGAAA TTGCAAGGAA AATAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCCAG ATAGATATGT AAACCAAAAT ACTATGCCCC 3540 
TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600 

CCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660 
ATTTCGTCAG TTGCTCCCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACACATGAC TGAGTTCCTA TCCCTGCACT GGTACTGGCT CTTTTCTCCT CTTTCCTTGC 3780 
CTCAGGGTTC GTGCTACCCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840 
CTTTCCTTTA AAGGGGAACA AAGCCTTTTT TTTTTTTGAG ACGGAGTGTT GCTCTGTCAC 3900 

CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960 
GTGATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC GGGCACGCAC CACCACGTCT 4020 
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080 
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTCCCG AAGTGCTGGG ATTATAGGTG 4140 

TGAGCCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG CCCTCAAACC 4200 

GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260 

CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATCTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAACTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCTP TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 

CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500 

AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATCCTTG TGATGAAGTA 4560 

GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620 
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC TTTTGCTCTT ACGAACATAA 4680 
TGGACTCTTA AGAATGGAAA GGGATGACAT TTACCTATGT GTGCTGCCTC ATTCCTGGTG 4740 
AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800 

AAAGAAAAGA AAAAAATAGT TGGAAAATAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860 

TTTATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920 
TCTATGATGA TGTCTTCCTA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980 
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTCCC 5040 
CATAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TTTTTAAAAA 5100 

ACGTGTTGTA CCTAAGGCCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160 



TATGTATTAT ATAAAAAAAA AAACCCTTAA TGCACTGTTA TCTCCTAAAT ATTTAGTAAA 5220 
TTAATACTAT TTAATTTTTT TAAAGATTTG TCTGTGTAGA CACTAAAAGT ATTACACAAA 5280 
ATCTGGACTG AAGGTGTCCT TTTTAACAAC AATTTAAAGT ACTTTTTATA TATGTTATGT 5340 
AGTATATCCT TTCTAAACTG CCTAGTTTGT ATATTCCTAT AATTCCTATT TGTGAAGTGT 5400 
ACCTGTTCTT GTCTCTTTTT TCAGTCATTT TCTGCACGCA TCCCCCTTTA TATGGTTATA 5460 
GAGATGACTG TAGCTTTTCG TGCTCCACTG CGAGGTTTGT GCTCAGAGCC GCTGCACCCC 5520 
AGCGAGGCCT GCTCCATGGA GTGCAGGACG AGCTACTGCT TTGGAGCGAG GGTTTCCTGC 5580 
TTTTGAGTTG ACCTGACTTC CTTCTTGAAA TGACTGTTAA AACTAAAATA AATTACATTG 5640 
CATTTATTTT ATATTCTTGG TTGAAATAAA ATTTAATTGA CTTTG 

SEQ ID NO:78 PD03 Protein sequence: 
Protein Accession #: BAA82980 

1 11 21 31 41 51 

| | | | | | 

VKSLLYQILD GIHYLHANWV LHRDLKPANI LVMGEGPERG RVKIADMGFA RLFNSPLKPL 
ADLDPVVVTF WYRAPELLLG ARHYTKAIDI WAIGCIFAEL LTSEPIFHCR QEDIKTSNPF 
HHDQLDRIFS VMGFPADKDW EDIRKMPEYP TLQKDFRRTT YANSSLIKYM EKHKVKPDSK 
VFLLLQKLLT MDPTKRITSE QALQDPYFQE DPLPTLDVFA GCQIPYPKRE FLNEDDPEEK 
GDKNQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTNG TAGAAGAGVG GTGAGLQHSQ 
DSSLNQVPPN KKPRLGPSGA NSGGPVMPSD YQHSSSRLNY QSSVQGSSQS QSTLGYSSSS 
QQSSQYHPSH QAHRY 

SEQ ID NO:79 PD05 DNA SEQUENCE 

Nucleic Acid Accession #: XM_002922 

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCATTGAA 60 
GAGGTACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTCC 120 
AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180 
TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATGAAGAT 240 
ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300 
GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360 
TATGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGTG 420 
GTACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 
AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540 
ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTGAT TTCTACATTT 600 
ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660 
TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720 
ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780 
TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840 
CTAGACTGGG CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATGTAAA GGCACTGACC 900 
AGGGTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960 
TCACGATGGA CTTTGCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020 
CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080 
TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATG 1140 
GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200 
ATAAATGAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260 
CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320 
GAGTCCATCA AATCCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380 
AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 
GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCAGC 1500 
ATGATGGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560 
AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620 
GAAGACTATG GTGTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680 
TGTAGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740 
TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800 
ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860 
GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920 
ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980 
CTTGTTGTGG CACAGTTCAG TGGCCTGGTA CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040 
CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100 
ACAGAGGATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160 
AAACTAGAGA CCAAGAAGAC AAAACTCTGA 

SEQ ID NO:80 PD05 Protein sequence: 
Protein Accession #: XP_002922 

1 11 21 31 41 51 

| | | | | | 

MNPFQKNESK ETLFSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60 
YGMKAVLILY FLYFLHWNED TSTSIYHAFS SLCYFTPILG AAIADSWLGK FKTIIYLSLV 120 
YVLGHVIKSL GALPILGGQV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFEEKHAEER 180 
TRYFSVFYLS INAGSLISTF ITPMLRGDVQ CFGEDCYALA FGVPGLLMVI ALVVFAMGSK 240 
IYNKPPPEGN IVAQVFKCIW FAISNRFKNR SGDIPKRQHW LDWAAEKYPK QLIMDVKALT 300 
RVLFLYIPLP MFWALLDQQG SRWTLQAIRM NRNLGFFVLQ PDQMQVLNPF LVLIFIPLFD 360 
FVIYRLVSKC GINFSSLRKM AVGMILACLA FAVAAAVEIK INEMAPAQSG PQEVFLQVLN 420 
LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDFHFHLKY HNLSLYTEHS 480 
VQEKNWYSLV IREDGNSISS MMVKDTESKT TNGMTTVRFV NTLHKDVNIS LSTDTSLNVG 540 
EDYGVSAYRT VQRGEYPAVH CRTEDKNFSL NLGLLDFGAA YLFVITNNTN QGLQAWKIED 600 
IPANKMSIAW QLPQYALVTA GEVMFSVTGL EFSYSQAPSS MKSVLQAAWL LTIAVGNIIV 660 
LVVAQFSGLV QWAEFILFSC LLLVICLIFS IMGYYYVPVK TEDMRGPADK HIPHIQGNMI 720 
KLETKKTKL 

SEQ ID NO:81 PD06 DNA SEQUENCE 

Nucleic Acid Accession #: NM_020448 

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTGCAGCAGC TGCCTCCCAC AAGTAGCTCC 60 

AGCGCCGTAA GCGAGGCCTC CTTCTCCTAC AAGGAAAACC TGATTGGCGC CCTCTTGGCG 120 

ATCTTCGGGC ACCTCGTGGT CAGCATTGCA CTTAACCTCC AGAAGTACTG CCACATCCGC 180 

CTGGCAGGCT CCAAGGATCC CCGGGCCTAT TTCAAGACCA AGACATGGTG GCTGGGCCTG 240 

TTCCTGATGC TTCTGGGCGA GCTGGGTGTG TTCGCCTCCT ACGCCTTCGC GCCGCTGTCA 300 

CTCATCGTGC CCCTCAGCGC AGTTTCTGTG ATAGCTAGTG CCATCATAGG AATCATATTC 360 

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT ACGTCTTGTC CTTTGTTGGC 420 

TGCGGTTTGG CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480 

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT GGCCTTTCCT TTTGTACATG 540 

CTGGTGGAGA TCATTCTGTT CTGCTTGCTG CTCTACTTCT ACAAGGAGAA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT CCATGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TGTCTTGTCC ATTCAAGGGA ACCTGCAGCT TGACTACCCC 720 

ATCTTCTACG TGATGTTCGT GTGCATGGTG GCAACCGCCG TCTATCAGGC TGCGTTTTTG 780 

AGTCAAGCCT CACAGATGTA CGACTCCTCT TTGATTGCCA GTGTGGGCTA CATTCTGTCC 840 

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTACCTGG ACTTCATCGG GGAGGACGTG 900 

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960 

ACGCGTAACA GGAAGAAGCC CATTCCATTT GAGCCCTATA TTTCCATGGA TGCCATGCCA 1020 

GGTATGCAGA ACATGCACGA TAAAGGGATG ACTGTCCAGC CTGAACTTAA AGCTTCTTTT 1080 

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TGCCACCCTG 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTCCTA 1200 
GAGCACACCA AGAAGGAATG A 

SEQ ID NO:82 PD06 Protein sequence 
Protein Accession #: NP_065181 

1 11 21 31 41 51 

| | | | | | 

MDGSHSAALK LQQLPPTSSS SAVSEASFSY KENLIGALLA IFGHLVVSIA LNLQKYCHIR 60 

LAGSKDPRAY FKTKTWWLGL FLMLLGELGV FASYAFAPLS LIVPLSAVSV IASAIIGIIF 120 

IKEKWKPKDF LRRYVLSFVG CGLAVVGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180 

LVEIILFCLL LYFYKEKNAN NIVVILLLVA LLGSMTVVTV KAVAGMLVLS IQGNLQLDYP 240 

IFYVMFVCMV ATAVYQAAFL SQASQMYDSS LIASVGYILS TTIAITAGAI FYLDFIGEDV 300 

LHICMFALGC LIAFLGVFLI TRNRKKPIPF EPYISMDAMP GMQNMHDKGM TVQPELKASF 360 
SYGALENNDN ISEIYAPATL PVMQEEHGSR SASGVPYRVL EHTKKE 

SEQ ID NO:83 PD08 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032712 
Coding sequence: 555-908 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCATCCC TCCAGACACT 60 

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAG GAAAGCGCTG CCACCCACCC 120 

ACCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAG AACTGCAACA GGCTCTCCTG GGTCCTGCAG GTGTACAGCC GGGCCCCTGC 240 

CTTGTGCCTC AGCTCTCGAG AGCTGCTGCT GCCGGGTGAC CTGATCCAAC CTGATAAGGT 300 

GCCATCTTCA GCTACCACTG CAAGGCCCTG AGGGCAACAG CAGCACGGCA CTGCCCACCC 360 

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTCGAGGC CACTGAGCCA 420 

CCCTTCCAGC CCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTCGTCCT 480 

CTGCCCCTGA GCCAGTGACG CCCAAGGACA TGCCTGTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGCCACC AGGCTCATGC TGGGACCCAA 660 

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGGTGG ATAAGTGGGC 720 

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTCCCTAT TGTGACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGACCTG GGCCTGGCTC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTGGGG GAGCTGCTCC 900 

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960 

CTACCCTCCC TGTGGAGCTG TTCGGTGTCC GTCGAGCTAG CCACACCCTG ACACCATGTT 1020 

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTGTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAAGCAGG CCTTGTGAGT GGACACTGAC CATGAGTCCC TGGGGGGAGT 1140 

GATCCCCCAG GCATCGTGTG CCATGTTGCA CTTCTGCCCA GGCRGCAGGG TGGGTGGGTA 1200 

CCATGGGTGC CCACCCCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260 

ACCCCACACC CTTGACATAA AAGCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA 

SEQ ID NO:84 PD08 Protein sequence 
Protein Accession #: NP_116101 

1 11 21 31 41 51 
| | | | | | 

MTVLEAVLEI QAITGSRLLS MVPGPARPPG SCWDPTQCTR TWLLSHTPRR RWISGLPRAS 60 
CRLGEEPPPL PYCDQAYGEE LSIRHRETWA WLSRTDTAWP GAPGVKQARI LGELLLV 
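
The running position numbers printed at the end of each row give a simple consistency check on a transcribed listing, for example to catch a block that has lost a residue. The following Python sketch illustrates one way to apply it; the function name `parse_listing_rows` is hypothetical, and the input is assumed to contain only sequence rows (no ruler, tick, or header lines).

```python
def parse_listing_rows(rows):
    """Join the residue blocks of a listing and verify row-end numbers.

    Each row is expected to hold whitespace-separated blocks of residues,
    optionally followed by the cumulative residue count for that row.
    Raises ValueError when a printed count disagrees with the blocks.
    """
    blocks = []
    length = 0
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue  # skip blank spacer lines
        expected = None
        if tokens[-1].isdigit():
            expected = int(tokens.pop())
        for block in tokens:
            blocks.append(block)
            length += len(block)
        if expected is not None and length != expected:
            raise ValueError(
                f"row ends at position {length}, listing says {expected}"
            )
    return "".join(blocks)
```

A row whose blocks do not add up to the printed number is reported rather than silently accepted, so the affected block can be compared against the corresponding nucleotide sequence.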

SEQ ID NO:85 PDT1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000693 

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

AGCCGGTGCG CCGCAGACTA GGGCGCCTCG GGCCAGGGAG CGCGGAGGAG CCATGGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCGGCCC TGCCGCGCCC 120 

CATCCGCAAC CTGGAGGTCA AGTTCACCAA GATATTTATC AACAATGAAT GGCACGAATC 180 

CAAGAGTGGG AAAAAGTTTG CTACATGTAA CCCTTCAACT CGGGAGCAAA TATGTGAAGT 240 

GGAAGAAGGA GATAAGCCCG ACGTGGACAA GGCTGTGGAG GCTGCACAGG TTGCCTTCCA 300 

GAGGGGCTCG CCATGGCGCC GGCTGGATGC CCTGAGTCGT GGGCGGCTGC TGCACCAGCT 360 

GGCTGACCTG GTGGAGAGGG ACCGCGCCAC CTTGGCCGCC CTGGAGACGA TGGATACAGG 420 

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTGGGGCC ATCACTCCAT GGAACTTCCC 600 

CCTGCTGATG CTGGTGTGGA AGCTGGCACC CGCCCTCTGC TGTGGGAACA CCATGGTCCT 660 

GAAGCCTGCG GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

CGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAG TGGGAGCAGC 780 

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840 

ACTGGTTAAA GAAGCTGCGT CCCGGAGCAA TCTGAAGCGG GTGACGCTGG AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT GTGCGGACGC TGACTTGGAC TTGGCAGTGG AGTGTGCCCA 960 

TCAGGGAGTG TTCTTCAACC AAGGCCAGTG TTGCACGGCA GCCTCCAGGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA AACGGCCCGT 1080 

GGGAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAGTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGGG 1200 

CTCAGCCATG GAAGACAAGG GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260 

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AATCTCGACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TTAAAATGTC 1500 

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGCCGAA TACACAGAAG TGAAAACTGT 1560 

CACCATCAAA CTTGGCGACA AGAACCCCTG AAGGAAAGGC GGGGCTCCTT CCTCAAACAT 1620 

CGGACGGCGG AATGTGGCAG ATGAAATGTG CTGGAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTGGAG TTGAATGATT GCTGTTTTCC 1740 

TCTCACTCTC CTGTTTATTC ACCAGACTGG GGATGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTGTGTTT CCCTTTAAAC CAGATCCTGG 1860 

AGACAGTGAG ATACTCAGGG CGTTGTTAAC AGGGAGTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAG GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TCCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTGGTTTT TGTTTTTTGT TTTCTTGTTT TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160 

TTGGTTAGTG CATACCGTGG AAGGGCGCCA GGGTCTTTGT GGATTGCATG TTGACATTGA 2220 

CCGTGAGATT CGGCTTCAAA CCAATACTGC CTTTGGAATA TGACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTTTC TTAAGCAGAA 2340 

AATATTGTTG AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400 

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTATCAC GTTCACTAAC AGCTTATGAT AAGTCTGTGT AGTCTTCCTT 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGCGTA CACAACTGGA AAGACTGCTG 2580 

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTATTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760 

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCATATAGCT TCAAAAACAA AAACAAATGT 2820 

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTG 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000 

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA CCAATAACTT 3060 

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGATAGTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCCG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGATA 3420 
AAATAAAATT GGATATTTGA GA 

SEQ ID NO:86 PDT1 PROTEIN SEQUENCE 

Protein Accession #: NP_000684 

1 11 21 31 41 51 

| | | | | | 

MATANGAVEN GQPDGKPPAL PRPIRNLEVK FTKIFINNEW HESKSGKKFA TCNPSTREQI 60 

CEVEEGDKPD VDKAVEAAQV AFQRGSPWRR LDALSRGRLL HQLADLVERD RATLAALETM 120 

DTGKPFLHAF FIDLEGCIRT LRYFAGWADK IQGKTIPTDD NVVCFTRHEP IGVCGAITPW 180 

NFPLLMLVWK LAPALCCGNT MVLKPAEQTP LTALYLGSLI KEAGFPPGVV NIVPGFGPTV 240 

GAAISSHPQI NKIAFTGSTE VGKLVKEAAS RSNLKRVTLE LGGKNPCIVC ADADLDLAVE 300 
CAHQGVFFNQ GQCCTAASRV FVEEQVYSEF VRRSVEYAKK RPVGDPFDVK TEQGPQIDQK 360 

QFDKILELIE SGKKEGAKLE CGGSAMEDKG LFIKPTVFSE VTDNMRIAKE EIFGPVQPIL 420 

KFKSIEEVIK RANSTDYGLT AAVFTKNLDK ALKLASALES GTVWINCYNA LYAQAPFGGF 480 
KMSGNGRELG EYALAEYTEV KTVTIKLGDK NP 

SEQ ID NO:87 PDV3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032642 

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GACCATTAGC AGGCACCCAG GOCTGTCTTT GGCTCGGAAA CGGTGGCCCC CAATGTAGCC 60 

TAGTTTGAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120 

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGGGC ACTGGGGAGG GCTGAGGCCG 180 

ACCATGCCCA GCCTGCTGCT GCTGTTCACG GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240 

CTGACAGACG CCAACTCCTG GTGGTCATTA GCTTTGAACC CGGTGCAGAG ACCCGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAGT CAGCTTCCCG GGCTCTCCCC TGGCCAGAGG 360 

AAGCTGTGCC AATTGTACCA GGAGCACATG GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ATCAAGGAAT GCCAGCACCA GTTCCGGCAG CGGCGGTGGA ATTGCAGCAC AGCGGACAAC 480 

GCATCTGTCT TTGGGAGAGT CATGCAGATA GGCAGCCGAG AGACCGCCTT CACCCACGCG 540 

GTGAGCGCCG CGGGCGTGGT CAACGCCATC AGCCGGGCCT GCCGCGAGGG CGAGCTCTCC 600 

ACCTGCGGCT GCAGCCGGAC GGCGCGGCCC AAGGACCTGC CCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGG ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780 

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840 

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900 

AAGGTCGGGG ACCGGCTGAA GGAGAAGTAC GACAGCGCGG CCGCCATGCG CGTCACCCGC 960 

AAGGGCCGGC TGGAGCTGGT CAACAGCCGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020 

TATGTGGACC CCAGCCCCGA CTACTGCCTG CGCAACGAGA GCACGGGCTC CCTGGGCACG 1080 

CAGGGCCGCC TCTGCAACAA GACCTCGGAG GGCATGGATG GCTGTGAGCT CATGTGCTGC 1140 

GGGCGTGGCT ACAACCAGTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200 

TGGTGCTGCT TCGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260 

TAGCCCGGAG GGCCTGCTCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGGAGATCT CTGAGGAGTG 1440 

GACTTTGCTG GTTCTCTCCT CTTGGTGGGT GGGAGACAGG GCTTTTTCTC TCCCTCTGGC 1500 

GAGGACTCTC AGGATGTAGG GACTTGGAAA TATTTACTGT CTGTCCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAGAG GCCCTATGAA GGTGGCGGGA 1680 

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA ATGTAGATGG GTTCCGTAAG 1740 

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860 

GAGATCTGCA CCTCCCGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGGAAAG GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980 

CCCTAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040 

GACGTTATAC TGTCTTCCCC CACCCCCGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TGGCGGGCCC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGGTTTCTCT CTGACATTAA ATGCCCTTCA 2220 
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 



SEQ ID NO:88 PDV3 Protein sequence 
Protein Accession #: NP_116031 

1 11 21 31 41 51 

| | | | | | 

MPSLLLLFTA ALLSSWAQLL TDANSWWSLA LNPVQRPEMF IIGAQPVCSQ LPGLSPGQRK 60 
LCQLYQEHMA YIGEGAKTGI KECQHQFRQR RWNCSTADNA SVFGRVMQIG SRETAFTHAV 120 
SAAGVVNAIS RACREGELST CGCSRTARPK DLPRDWLWGG CGDNVEYGYR FAKEFVDARE 180 
REKNFAKGSE EQGRVLMNLQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT CWLQLAEFRK 240 
VGDRLKEKYD SAAAMRVTRK GRLELVNSRF TQPTPEDLVY VDPSPDYCLR NESTGSLGTQ 300 
GRLCNKTSEG MDGCELMCCG RGYNQFKSVQ VERCHCKFHW CCFVRCKKCT EIVDQYICK 



SEQ ID NO:89 PDT9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_033280 

Coding sequence: 58-638 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATG 60 

GTGCGTGCGG GCGCCGTGGG GGCTCATCTC CCCGCGTCCG GCTTGGATAT CTTCGGGGAC 120 

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTCGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTGTTCCTC 300 

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360 

CGAGACATTC CAATAGTTCA CAGAGTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420 

AAATTTCTGA CTAAAGGAGA TAATAATGAA GTTGATGATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGGTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA AGTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATG AGAAGCAGTT CCTGGGACCA 660 

GATTGAAATG AATTCTGTTG AAAAAGAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720
TGTATAAAAG GCAACAGTGT GGAGATGTTT TTGTCTTGTC CAAATAAAAG ATTCACCAGT 780
AAAAAAAAAA AAAA 

SEQ ID NO:90 PDT9 Protein sequence
Protein Accession #: NP_150596

1          11         21         31         41         51

|          |          |          |          |          |

MVRAGAVGAH LPASGLDIFG DLKKMNKRQL YYQVLNFAMI VSSALMIWKG LIVLTGSESP 60

IVVVLSGSME PAFHRGDLLF LTNFREDPIR AGEIVVFKVE GRDIPIVHRV IKVHEKDNGD 120

IKFLTKGDNN EVDDRGLYKE GQNWLEKKDV VGRARGFLPY VGMVTIIMND YPKFKYALLA 180
VMGAYVLLKR ES

SEQ ID NO:91 PDV5 DNA SEQUENCE

Nucleic Acid Accession #: NM_016590

Coding sequence: 691-975 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

GATTACTCAC ACAGTCTTGA AGATGCAATG TCAGCTATTT AGGACAGAAA CATCCAAGGC 60 

CGTGTCAGAA CTCAATTACG ACTACATATG CATTAAGGCA GGAACTGGCA GGCCTCAGGG 120 

TACGCCAACT ATAGGACTCG TGCTTCTCGT ACGCTGGGCT ATAATCTATG AAACTGAGCT 180 

CCAGAGCCAG CCAATCACTT AGCTCCTCAT AACAAGTCTA ACTGGCTCTG GAAAGCTGAA 240 

AGGGCTGCAC TGGAACAACA CAGATGAGAT ATTCTACACA TTAATCTACT TATCTGGAAT 300 

CACTTTGCCT CTAAAGGCCA GAGAAAAATC ACAGCTTCCT TGTCGGAGGG GAAAAGGACA 360 

GGTGATCTGG GGAAAACGCA GCTACACCTG GAGCAAGGTC TCTTCCCGGC TTGGCAATCT 420 

CAGCTGTGCC GGCGCTACGG GACCCGAGCC GTCCCAGAAA CCAAAGGGCA GGCACGGCAG 480 

CAAACGCCTG AGTGCTGCTG CCTTCGGTGA CTATATGAGA ATGGAAACTT CTAAGGAAGC 540 

CAGGTTGTTA GAATTGTTAC CCCCTTTACT CAGAGATAAC ATAGATTATC CAGGCTGAGA 600 

TGGAAAACAA GCCCTTTATT GAATTTTCAA CACAGACTCC CTGCTTCTCA TCTCCTTAAT 660 

AAAATTTCAT TAAAATCCCC TTGAACTCCC ATGTTCAAAT CTCCATTTGT TGACAGACAA 720

AGCCAACAAT ACTCTAAACT GAGGCCTGCA AGTCATTTCA TTTGTATTTT TGTCCAGAAA 780 

TTTCCCATAG GAAGACTTCA CCTCCTACAA CTCCGAAGAA AACCCTTACT GTCCAAGACC 840 

GTCACCAGCA ACCATCCGCA GTCATTCAAG TGGAAGCTTT CACAGCTTTT GTACATTCTC 900 

TGTGTCAATA TACAACTGAG TTACAGACTG TCCCCTGGCT CCCTGACCCT TACAAACACT 960 

AAAAGTTTTG TTTGACTCAA CTTCAAGCTG CTCATCTGTT AGTAAGTGAT GTTCACTCCA 1020 

GAACACATTC ATGATGAGAA CTTTCTAAAA GACCAGCACT GCTCTTCCCC TCCTATAATC 1080 

ATAATAATCA TGATAACCTG AAACATGTTA CTGGGACTCG ACATTTTTCT GGGGATTGAA 1140 

ATCTTTAGTC CTTGGAGCTG TCACATAGCA GGGGCAACCT CACACTGAAA CAAAGGAAGT 1200 

GATGTCCCAT TATTATCCAC CCTGAGCCAC CATAATATGC TGTTTACATT TATTTTCTTC 1260 

AGCCTGTGCA AAACAAAGCA ATGGAAAAGG AAACTAAAAA ATATACATAC TAGTACCATT 1320 

ATCTTCTTTT GCCTAAAATT ACTAATGCAC CACGTCAGTC TGCTTCCTTC AGGCATCATT 1380 

CTCAATTCAT CAGGACTTGT ATTAGCAGGT TCTGGCTAGA GAGACTATCT CCTGTCATCA 1440 

CGATCAATTA ATGTTTTCTG GTGATCACAT CAGGCCCTAT CTAAGAAGCT CATGGTATAC 1500 

AAGGGTCACC CAAATAGCTG AGTGCAGTCC TTGCTCATAT TTCCTTCATC TTAACCCCGC 1560 

AAACAAGAAT TAAGATGATC CCAATAAAAG AAAAATTGCT CAGGAAACTG AACCTTTTTC 1620 

TGAACCAAGC ACTGTCAGCA AATCTCAGGT ATTAGAGCAA CTATGGTTGA TTGAAAAGTG 1680 

TCTCAAAATC TGGGCCAAGA ATGATTGCTA GGTCCATAAG CTAATTTGTC TGGCCTTGCC 1740 

ATTTACGTAA GCCAAAGAAA GTCACTCATG AGTAAACTAT AGAAAACGTT CAGACCCATC 1800 

CTGTTAGTAT GTCAAATCAA CTAAGACTGG CAGGGTATTA ACTCCATTCC AGGTGACATG 1860 

GATAAAGAGC CCCATTATTT TCACAGTGCC AGCCTCTACC TAAGGAAACC CTAGACCTTG 1920 

GAACCAGTTT CCTGGTAGGG AACTGCTGAC AGTTTCAATG CTGACAGTTG GAGCCAATGC 1980 

CTCATAGTGT AAACTGAAAG AAAAATAGTT GCTTTTTAAA ATGTCAGCAA GAAGGCCTGC 2040 

CTCATCTTAA CAAAGCAAAA AAAAATGCTT TAATTCAAAT TAAAAATCAT GATACTAAAA 2100 
AAAAAAAA 

SEQ ID NO:92 PDV5 Protein sequence
Protein Accession #: NP_057674

1          11         21         31         41         51

|          |          |          |          |          |

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV LLVRWAIIYE TELQSQPIT

SEQ ID NO:93 PEE6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002606

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60 

ATGGGATCCG GCTCCTCCAG CTACCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120 

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGCG ACATCATGGA CCTGTTCTGC 180 

ATCGCCACCG GCCTGCCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240 

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300

GTGGCCATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAACCACAAG CCGTGGCCAG 360 

TCTGCTGAGA GACCACTGAG GGACAGACGG GTTGTGGGCC TGGAGCAGCC CCGGAGGGAA 420 

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCCAGAG AGCCCCAGGG CTGCTACCAG 480 

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540 

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600 

TTGGCTGTCC TAGAGAAACG CGTGGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660

TGCAAGAGTG ACATTAAGAA GATGAGGGAG GAGCTGGCGG CCAGAAGCAG CAGGACCAAC 720

TGCCCCTGTA AGTACAGTTT TTTGGATAAC CACAAGAAGT TGACTCCTCG ACGCGATGTT 780

CCCACTTACC CCAAGTACCT GCTCTCTCCA GAGACCATCG AGGCCCTGCG GAAGCCGACC 840

TTTGACGTCT GGCTTTGGGA GCCCAATGAG ATGCTGAGCT GCCTGGAGCA CATGTACCAC 900

GACCTCGGGC TGGTCAGGGA CTTCAGCATC AACCCTGTCA CCCTCAGGAG GTGGCTGTTC 960

TGTGTCCACG ACAACTACAG AAACAACCCC TTCCACAACT TCCGGCACTG CTTCTGCGTG 1020

GCCCAGATGA TGTACAGCAT GGTCTGGCTC TGCAGTCTCC AGGAGAAGTT CTCACAAACG 1080

GATATCCTGA TCCTAATGAC AGCGGCCATC TGCCACGATC TGGACCATCC CGGCTACAAC 1140

AACACGTACC AGATCAATGC CCGCACAGAG CTGGCGGTCC GCTACAATGA CATCTCACCG 1200

CTGGAGAACC ACCACTGCGC CGTGGCCTTC CAGATCCTCG CCGAGCCTGA GTGCAACATC 1260

TTCTCCAACA TCCCACCTGA TGGGTTCAAG CAGATCCGAC AGGGAATGAT CACATTAATC 1320

TTGGCCACTG ACATGGCAAG ACATGCAGAA ATTATGGATT CTTTCAAAGA GAAAATGGAG 1380

AATTTTGACT ACAGCAACGA GGAGCACATG ACCCTGCTGA AGATGATTTT GATAAAATGC 1440

TGTGATATCT CTAACGAGGT CCGTCCAATG GAAGTCGCAG AGCCTTGGGT GGACTGTTTA 1500

TTAGAGGAAT ATTTTATGCA GAGCGACCGT GAGAAGTCAG AAGGCCTTCC TGTGGCACCG 1560

TTCATGGACC GAGACAAAGT GACCAAGGCC ACAGCCCAGA TTGGGTTCAT CAAGTTTGTC 1620

CTGATCCCAA TGTTTGAAAC AGTGACCAAG CTCTTCCCCA TGGTTGAGGA GATCATGCTG 1680

CAGCCACTTT GGGAATCCCG AGATCGCTAC GAGGAGCTGA AGCGGATAGA TGACGCCATG 1740

AAAGAGTTAC AGAAGAAGAC TGACAGCTTG ACGTCTGGGG CCACCGAGAA GTCCAGAGAG 1800

AGAAGCAGAG ATGTGAAAAA CAGTGAAGGA GACTGTGCCT GAGGAAAGCG GGGGGCGTGG 1860

CTGCAGTTCT GGACGGGCTG GCCGAGCTGC GCGGGATCCT TGTGCAGGGA AGAGCTGCCC 1920

TGGGCACCTG GCACCACAAG ACCATGTTTT CTAAGAACCA TTTTGTTCAC TGATACAAAA 1980
AAAAAAAAAA A

SEQ ID NO:94 PEE6 Protein sequence
Protein Accession #: NP_002597

1          11         21         31         41         51

|          |          |          |          |          |

MGSGSSSYRP KAIYLDIDGR IQKVIFSKYC NSSDIMDLFC IATGLPRNTT ISLLTTDDAM 60
VSIDPTMPAN SERTPYKVRP VAIKQLSAGV EDKRTTSRGQ SAERPLRDRR VVGLEQPRRE 120
GAFESGQVEP RPREPQGCYQ EGQRIPPERE ELIQSVLAQV AEQFSRAFKI NELKAEVANH 180
LAVLEKRVEL EGLKVVEIEK CKSDIKKMRE ELAARSSRTN CPCKYSFLDN HKKLTPRRDV 240
PTYPKYLLSP ETIEALRKPT FDVWLWEPNE MLSCLEHMYH DLGLVRDFSI NPVTLRRWLF 300
CVHDNYRNNP FHNFRHCFCV AQMMYSMVWL CSLQEKFSQT DILILMTAAI CHDLDHPGYN 360
NTYQINARTE LAVRYNDISP LENHHCAVAF QILAEPECNI FSNIPPDGFK QIRQGMITLI 420
LATDMARHAE IMDSFKEKME NFDYSNEEHM TLLKMILIKC CDISNEVRPM EVAEPWVDCL 480
LEEYFMQSDR EKSEGLPVAP FMDRDKVTKA TAQIGFIKFV LIPMFETVTK LFPMVEEIML 540
QPLWESRDRY EELKRIDDAM KELQKKTDSL TSGATEKSRE RSRDVKNSEG DCA

SEQ ID NO:95 PEG4 DNA SEQUENCE

Nucleic Acid Accession #: none

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60

TGCCCTCCTG GGGGGCCCCA CCTGGGCAGG GAAGATGTAT GGCCCTGGAG GAGGCAAGTA 120

TTTCAGCACC ACTGAAGACT ACGACCATGA AATCACAGGG CTGCGGGTGT CTGTAGGTCT 180

TCTCCTGGTG AAAAGTGTCC AGGTGAAACT TGGAGACTCC TGGGACGTGA AACTGGGAGC 240

CTTAGGTGGG AATACCCAGG AAGTCACCCT GCAGCCAGGC GAATACATCA CAAAAGTCTT 300

TGTCGCCTTC CAAGCTTTCC TCCGGGGTAT GGTCATGTAC ACCAGCAAGG ACCGCTATTT 360

CTATTTTGGG AAGCTTGATG GCCAGATCTC CTCTGCCTAC CCCAGCCAAG AGGGGCAGGT 420

GCTGGTGGGC ATCTATGGCC AGTATCAACT CCTTGGCATC AAGAGCATTG GCTTTGAATG 480

GAATTATCCA CTAGAGGAGC CGACCACTGA GCCACCAGTT AATCTCACAT ACTCAGCAAA 540

CTCACCCGTG GGTCGCTAGG GTGGGGTATG GGGCCATCCG AGCTGAGGCC ATCTGTGTGG 600

TGGTGGCTGA TGGTACTGGA GTAACTGAGT CGGGACGCTG AATCTGAATC CACCAATAAA 660
TAAAGCTTCT GCAGAATCAG TGAAAAAAAA A

SEQ ID NO:96 PEG4 Protein sequence
Protein Accession #: FGENESH predicted

1          11         21         31         41         51

|          |          |          |          |          |

MLLLLTLALL GGPTWAGKMY GPGGGKYFST TEDYDHEITG LRVSVGLLLV KSVQVKLGDS 60
WDVKLGALGG NTQEVTLQPG EYITKVFVAF QAFLRGMVMY TSKDRYFYFG KLDGQISSAY 120
PSQEGQVLVG IYGQYQLLGI KSIGFEWNYP LEEPTTEPPV NLTYSANSPV GR



SEQ ID NO:97 PEL9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006953

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

CCGTTCCGCG CTCTGGCGGC TCCTCCCGGG CGATGCCTCC GCTCTGGGCC CTGCTGGCCC 60 

TCGGCTGCCT GCGGTTCGGC TCGGCTGTGA ACCTGCAGCC CCAACTGGCC AGTGTGACTT 120 

TCGCCACCAA CAACCCCACA CTTACCACTG TGGCCTTGGA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT GGCACCCACG AGGTCTACCT GTATGTCCTG GTCGACTCAG 240 

CCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC CCCACTGGGC TCAACGTTCC 300
TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCCCCT 360

GCAGTGACCT GCCCAGCCTG GATGCCATTG GGGATGTGTC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAACG GGACCTGCCT GTGGGATCCC AACTTCCAGG 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGGAGTACAG GTTCAAGTAT GTCCTGGTCA 540 

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTGTGGTC GGACCCCATC CGCACCAACC 600 

AGCTCACCCC ATACTCGACG ATCGACACGT GGCCAGGCCG GCGGAGCGGA GGCATGATCG 660 

TCATCACTTC CATCCTGGGC TCCCTGCCCT TCTTTCTACT TGTGGGTTTT GCTGGCGCCA 720 

TTGCCCTCAG CCTCGTGGAC ATGGGGAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGGAC AGGGCTGAGG TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCACCACCC CTGGGCAGCA GCATCCTCCT CTCTGGCCTT GCCCCAGGCC CTGCAGCGGT 960 

GGTTGTCACA CCCTGACTTC AGGGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 

SEQ ID NO:98 PEL9 Protein sequence
Protein Accession #: NP_008884

1          11         21         31         41         51

|          |          |          |          |          |

MPPLWALLAL GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKPLCMFD SKEALTGTHE 60

VYLYVLVDSA ISRNASVQDS TNTPLGSTFL QTEGGRTGPY KAVAFDLIPC SDLPSLDAIG 120

DVSKASQILN AYLVRVGANG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN MSTGLVEDQT 180

LWSDPIRTNQ LTPYSTIDTW PGRRSGGMIV ITSILGSLPF FLLVGFAGAI ALSLVDMGSS 240
DGETTHDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD

SEQ ID NO:99 PEN1 DNA SEQUENCE

Nucleic Acid Accession #: NM_012391

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GCCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGCAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180

TCAGAGGGCC ACCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGCCTCT 240 

GCCACCAGCC CTGCTGGCCC CTGGTTCCGC TGGCCCCCCA GATGCCTGGC TGAGACACGC 300 

CAGTGGCCTC AGCTGCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AGCCGCCAGC CCAAACAGCA GCGGCATGGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGGTCTCG AGAGACGGGA 540 

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600

CTTTGACATG CTGTACCCTG AGGACAGCAG CTGGGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780 

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840 

CATCACCGCA GATCCCATGG ACTGGAGCCC CAGCAATGTG CAGAAGTGGC TCCTGTGGAC 900 

AGAGCACCAA TACCGGCTGC CCCCCATGGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCCG CCAGCGCTCG CCCCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACCGACAGCG AGGTGGACTC 1140 

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGCCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380 

CATCTCCCAG CGCCTCGTCT ACCAGTTCGT GGACCCCATC TGAGTGCCTG GCCCAGGGCC 1440 

TGAAACCCGC CCTCAGGGGC CTCTCTCCTG CCTGCCCTGC CTCAGCCAGG CCCTGAGATG 1500 

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGACCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCCC TGGAGGACAG AGGGAGACAG GGCTGCTCCC 1680 

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAAGGG CAGTGACTCG 1740 

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCCCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 
CCCGGGAATG GATAATAAAG ATACTAGAGA ACTG 

SEQ ID NO:100 PEN1 Protein sequence
Protein Accession #: NP_036523

1          11         21         31         41         51

|          |          |          |          |          |

MGSASPGLSS VSPSHLLLPP DTVSRTGLEK AAAGAVGLER RDWSPSPPAT PEQGLSAFYL 60

SYFDMLYPED SSWAAKAPGA SSREEPPEEP EQCPVIDSQA PAGSLDLVPG GLTLEEHSLE 120

QVQSMVVGEV LKDIETACKL LNITADPMDW SPSNVQKWLL WTEHQYRLPP MGKAFQELAG 180

KELCAMSEEQ FRQRSPLGGD VLHAHLDIWK SAAWMKERTS PGAIHYCAST SEESWTDSEV 240

DSSCSGQPIH LWQFLKELLL KPHSYGRFIR WLNKEKGIFK IEDSAQVARL WGIRKNRPAM 300
NYDKLSRSIR QYYKKGIIRK PDISQRLVYQ FVDPI

SEQ ID NO:101 PEN3 DNA SEQUENCE

Nucleic Acid Accession #: NM_000742

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

GAGAGAACAG CGTGAGCCTG TGTGCTTGTG TGCTGAGCCC TCATCCCCTC CTGGGGCCAG 60 

GCTTGGGTTT CACCTGCAGA ATCGCTTGTG CTGGGCTGCC TGGGCTGTCC TCAGTGGCAC 120 

CTGCATGAAG CCGTTCTGGC TGCCAGAGCT GGACAGCCCC AGGAAAACCC ACCTCTCTGC 180 

AGAGCTTGCC CAGCTGTCCC CGGGAAGCCA AATGCCTCTC ATCTAAGTCT TCTGCTCGAC 240 

GGGGTGTCTC CTAAACCCTC ACTCTTCAGC CTCTGTTTGA CCATGAAATG AAGTGACTGA 300 

GCTCTATTCT GTACCTGCCA CTCTATTTCT GGGGTGACTT TTGTCAGCTG CCCAGAATCT 360 

CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT TTTCTTCTGT AACCACAGGT 420 

TCGGTGGTGA GAGGAAGCCT CGCAGAATCC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480 

TCTGCTGGGG ACATGGTCCA TGGTGCAACC CACAGCAAAG CCCTGACCTG ACCTCCTGAT 540 

GCTCAGGAGA AGCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAGCTCAGCC 600 

TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAGC TAAGCGCCCA CCTCCCAGGG 660 

CTCCTGGAGA CCCACTCTCC TCTCCCAGTC CCACGGCATT GCCGCAGGGA GGCTCGCATA 720 

CCGAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CTACAACCGC TGGGCGCGCC 780 

CGGTGCCCAA CACTTCAGAC GTGGTGATTG TGCGCTTTGG ACTGTCCATC GCTCAGCTCA 840 

TCGATGTGGA TGAGAAGAAC CAAATGATGA CCACCAACGT CTGGCTAAAA CAGGAGTGGA 900 

GCGACTACAA ACTGCGCTGG AACCCCGCTG ATTTTGGCAA CATCACATCT CTCAGGGTCC 960 

CTTCTGAGAT GATCTGGATC CCCGACATTG TTCTCTACAA CAATGCAGAT GGGGAGTTTG 1020 

CAGTGACCCA CATGACCAAG GCCCACCTCT TCTCCACGGG CACTGTGCAC TGGGTGCCCC 1080 

CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACGTCACCTT CTTCCCCTTC GACCAGCAGA 1140 

ACTGCAAGAT GAAGTTTGGC TCCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200 

TGGAGCAGAC TGTGGACCTG AAGGACTACT GGGAGAGCGG CGAGTGGGCC ATCGTCAATG 1260 

CCACGGGCAC CTACAACAGC AAGAAGTACG ACTGCTGCGC CGAGATCTAC CCCGACGTCA 1320 

CCTACGCCTT CGTCATCCGG CGGCTGCCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380 

GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTCCGAC TGCGGCGAGA 1440 

AGATCACGCT GTGCATTTCG GTGCTGCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500 

AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAGTACCTG CTGTTCACCA 1560 

TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTTCGT GCTCAATGTG CACCACCGCT 1620 

CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTGTGCCCC 1680 

GGTGGCTTCT GATGAACCGG CCCCCACCAC CCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740 

AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAGGAG AGGGAGGTGG 1800 

TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860 

TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920 

AGGAGGGTGA GCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGGAAGGT GTGCACTACA 1980 

TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATG 2040 

TTGCCATGGT CATCGACAGG ATCTTCCTCT GGCTGTTTAT CATCGTCTGC TTCCTGGGGA 2100 

CCATCGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160 

CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220 

ATTTGGAGAT GAGCCCAAAG TGCCAGGGAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280 

CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTTT GGAGTCTGTC CGAGTTTGCA 2340 

GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCCGGA AGGGGAGGAA 2400 

GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460 

ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520 

CCAGGCTTCT CCTTGACGTC ATTCGTCTCC TTCCTTGCTG CAAAATGGCT CTGCACCAGC 2580 

CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640 
TACGCGTGCA GCAGGCAAAC AAGA 

SEQ ID NO:102 PEN3 Protein sequence
Protein Accession #: NP_000733

1          11         21         31         41         51

|          |          |          |          |          |

MGPSCPVFLS FTKLSLWWLL LTPAGGEEAK RPPPRAPGDP LSSPSPTALP QGGSHTETED 60

RLFKHLFRGY NRWARPVPNT SDVVIVRFGL SIAQLIDVDE KNQMMTTNVW LKQEWSDYKL 120

RWNPADFGNI TSLRVPSEMI WIPDIVLYNN ADGEFAVTHM TKAHLFSTGT VHWVPPAIYK 180

SSCSIDVTFF PFDQQNCKMK FGSWTYDKAK IDLEQMEQTV DLKDYWESGE WAIVNATGTY 240

NSKKYDCCAE IYPDVTYAFV IRRLPLFYTI NLIIPCLLIS CLTVLVFYLP SDCGEKITLC 300

ISVLLSLTVF LLLITEIIPS TSLVIPLIGE YLLFTMIFVT LSIVITVFVL NVHHRSPSTH 360

TMPHWVRGAL LGCVPRWLLM NRPPPPVELC HPLRLKLSPS YHWLESNVDA EEREVVVEEE 420

DRWACAGHVA PSVGTLCSHG HLHSGASGPK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480
RSEDADSSVK EDWKYVAMVI DRIFLWLFII VCFLGTIGLF LPPFLAGMI

SEQ ID NO:103 PEU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_018670

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60 

CGGCCCCCAG ACGCGCCGCC GCTGCCATGG CCCAGCCCCT GTGCCCGCCG CTCTCCGAGT 120

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180 

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240 

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300 

GCGGCGCGCG CAGCAGCCGC CTGGGCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360 

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420 

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480 

ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTGCC 540

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG GCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600 

CGCAGATGCA GACACGGACG CAGGCTGAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660



TATCCGCCGT 
CTGCACCCGA 
AGGCGATGGA 
AGACCTGGAT 
CAACTGACGC 
CCTTGGCAGA 
GGGTGAGAGC 
ATAGGGCTAG 
TGAATAAACT 



CCGCGCCGGG 
GCCGCGCGAC 
GCCAAGCCCA 
GCCCCTCTCG 
CGTCTCTGTG 
CTGCCTTTCC 
CGTCCCCACC 
ACACTTTGAG 
GTACTGGTGT 



GCGTCCTGGG 



CCTCTGGAGT 
AGCACCGAGG 
TGGAAGAGGG 
GCGGCGGCCC 
GCAAGCAGGA 
CAAAAAAAAA 



GATCCCCGCC 
TGTTCGCCGA 
TCCTTCCGGG 
GGCTGCCTGA 
CTTTTTGGCC 
CACGGGCGAT 
TTCTCAGCCC 
GGCTCTGCCT 
AAAAAAAAAA 



TGCCTGCCCC 

CGACGTGCTG 
GGAGCCCAAG 
TCAGCACCTT 
CCCGACGGGG 
CTCCCTCCAT 
AATGTGAATT 
A 



GGAGCCCGAG 
CCGGAAGGGC 
GCTCTGTTGG 
TGACAAGGGA 
CGAAGTGGTT 
GCATTCCTGC 
GGAGGGACCC 
TATTTATTTG 



720 
780 
840 
900 
960 
1020 
1080 
1140 



SEQ ID NO:104 PEU4 Protein sequence
Protein Accession #: NP_061140

1          11         21         31         41         51

|          |          |          |          |          |

MAQPLCPPLS ESWMLSAAWG PTRRPPPSDK DCGRSLVSSP DSWGSTPADS PVASPARPGT 60
LRDPRAPSVG RRGARSSRLG SGQRQSASER EKLRMRTLAR ALHELRRFLP PSVAPAGQSL 120
TKIETLRLAI RYIGHLSAVL GLSEESLQRR CRQRGDAGSP RGCPLCPDDC PAQMQTRTQA 180
EGQGQGRGLG LVSAVRAGAS WGSPPACPGA RAAPEPRDPP ALFAEAACPE GQAMEPSPPS 240
PLLPGDVLAL LETWMPLSPL EWLPEEPK

SEQ ID NO:105 PEU5 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons)



CCACGGAGAA 
ACAGCAATTT 
CACGCACATG 



AGAGCACAGG 
GTGTGGCTGT 
GTGTGGCCCC 
TCCCTGCGAG 
ACAACTACTC 
ACCGCTTCCG 
CTGGAATTGA 
GAATAGAGAA 
CTGCGGACTG 
GGCAAGGCGA 
TGCAGGCCCA 
AGGATGGGTC 
GCTCGGAGGC 
ACATTGCCCA 
CTTCCCTCAT 
ACGGCCTCAG 
CGCCCTCCAA 
AAGCCCCAGC 
TGAGGATGCT 
CTCACCCAGG 

TGTTGCTGAA 
CCTCAGCTCT 
AGGAGGCAGC 
TTGGCGAGTG 
CGCTCTGGGG 
TTGCCCAGGA 
CTACACCCAT 
TCATCACCTT 
ATAGTGTCAT 
TGGGGGTCCC 
GGTGCCTACG 
TGGTCAGCTA 
CGGCGCCGCC 
AGGAACTGCG 
CTGGCCATGC 
GCGACCTAGT 
TGTACCACCT 
TTCACATCTT 
TGAAGGACGT 
CCACGGAGGG 
TCTACCGTCC 
TCATGGAGCA 
AGGCGGGCAC 
TCCTGCTCGT 
TCGGCAAAGT 
GGGAATTCCA 
TCCTGCTCAG 
AGCATTTCCG 



11 

i 

GCCCACCGAT 
CCTCCGGCTC 
GGGCTTCCGT 
CCAGACCTGG 
AGCCTGGATT 
ACGGGACCAT 
CTGGGGTGTG 
GTACCGGTGG 
GGCCTTCTTC 
CTTGCGCCTG 
CATCCCTGTC 
CGCCACCCAG 
CCTGGCGGAG 
AGCCCGAGAT 
GGTGGAGAGG 
TGAGGAATTC 
CTCAGCCTAC 
GAGTGAACTC 
GGACGCCCTG 
CCTGGGCCAC 
CTCGCTCATC 
CCTAAAAGGG 
GCTGGGGAAG 
CCAGGGCTTC 
GGATGCTGGC 
CAGGGCACAG 
TGGGGCCTGT 
ACGGAGGAAA 
CTATCGCAGC 
GGATGCCACT 
TGGGGTACAG 
CTGGGCCCTG 
CAGGAAATCA 
TAATGGGGAA 
GCGCCAGTCG 
CCGCTGGTTC 
CCTGCTGTTC 
CGGCTCCCTG 
CCAGGGCCTG 
CTCACTGAGC 
GGCTCTCACC 
GGGCCGCACT 
CACGGTCAAC 
GTTCTTCTTC 
GCTCCTGAGG 
CTACCTGCAG 
CAGCAACTGC 
CTGCGTCTCC 
GGCCAACATC 
ACAGGGCAAC 
CTCTCGGCCC 
GCAATTGTGC 
GGTTTACCTT 



21 

I 

GCCTACGGAG 
TCTGACCGAA 
GCCCCGAACC 
CTGCAGGACC 
GTCACTGGGG 
CAGATGGCCA 
GTCCGGAATA 
CGCGGTGACC 
CTGGTGGACG 
GAGTCCTACA 
CTGCTCCTCC 
GCTCAGCTCC 
ACCCTGGAAG 
CGAATCAGGC 
ATTATGACCC 
GAGACCATAG 
CTGGATGAGC 
TTTCGGGGGG 
CTGAATGACC 
TTCCTGACCC 
CGCAACCTTT 
GGAGCTGCGG 
ATGTGCGCGC 
GGGGAGAGCA 
CTCGGGCAGG 
ATGGCCATGT 
TTGCTGCTCC 
GACCTGGCGT 
AGTGAGGTGA 
TGCCTCCAGC 
TCTCTGCTGA 
GTTCTCGCCT 
GAAGAGGAGC 
GGGCCTGTCG 
GGCCGTCCGG 
CACTTCTGGG 
TTGCTGCTTT 
GAGCTGCTGC 
AGCGGAGGCG 
CAGCGCCTGC 
TGCTTCCTCC 
GTCCTCTGCA 
AAACAGCTGG 
CTCTTCTTCC 
CCACGGGACA 
ATCTTCGGGC 
TCGTCGGAGC 
CAGTATGCCA 
CTGCTGGTCA 
AGCGATCTCT 
GCGCTGGCCC 
AGGCGACCCC 
TCTAAGGAAG 



31 
I 

AGCTGGACTT 
CGGATCCAGC 
TGGTGGTGTC 
TGCTGCGTCG 
GTCTGCACAC 
GCACTGGGGG 
GAGACACCCT 
CGGAGGACGG 
ACGGCACACA 
TCTCACAGCA 
TGATTGATGG 
CATGTCTCCT 
ACACTCTGGC 
GTTTCTTTCC 
GGAAGGAGCT 
TTTTGAAGGC 
TGCGTTTGGC 
ACATCCAATG 
GGCCTGAGTT 
CGATGCGCCT 
TGGACCAGGC 
AGCTCCGGCC 
CGAGGTACCC 
TGTATCTGCT 
CCCCCTGGAG 
ACTTCTGGGA 
GGGTGATGGC 
TCAAGTTTGA 
GGGCTGCCCG 
TGGCCATGCA 
CACAGAAGTG 
TCTTTTGCCC 
CCACACGGGA 
GGACGGCGGA 
GTTGCTGCGG 
GCGCGCCGGT 
TCTCGCGGGT 
TCTATTTCTG 
GGGGCAGCCT 
GCCTCTACCT 
TGGGC GTGGG 
TCGACTTCAT 
GGCCCAAGAT 
TCGGCGTGTG 
GTGACTTCCC 
AGATTCCCCA 
CCGGCTTCTG 
ACTGGCTGGT 
ACTTGCTCAT 
ACTGGAAGGC 
CGCCCTTTAT 
GGAGCCCCCA 
CCGAGCGGAA 



41 
I 

CACGGGGGCC 
TGCAGTTTAT 
AGTGCTGGGG 
TGGGCTGGTG 
GGGCATCGGC 
CACCAAGGTG 
CATCAACCCC 
GGTCCAGTTT 
CGGCTGCCTG 
GAAGACGGGC 
TGATGAGAAG 
CGTGGCTGGC 
CCCAGGGAGT 
CAAAGGGGAC 
CCTGACAGTC 
CCTTGTGAAG 
TGTGGCTTGG 
GCGGTCCTTC 
CGTGCGCTTG 
GGCCCAACTC 
GTCCCACAGC 
CCCTGACGTG 
CTCCGGGGGG 
CTCGGACAAG 
CGACCTGCTT 
GATGGGTTCC 
ACGCCTGGAG 
GGGGATGGGC 
CCTCCTCCTC 
AGCTGACGCC 
GTGGGGAGAT 
TCCACTCATC 
GGAGCTAGAG 
CCCAGCCGAG 
GGGCCGCTGC 
GACCATCTTC 
GCTGCTCGTG 
GGCTTTCACG 
CGCCAGCGGG 
CGCCGACAGC 
CTGCCGGCTG 
GGTTTTCACG 
CGTCATCGTG 
GCTGGTAGCC 
AAGTATCCTG 
GGAGGACATG 
GGCACACCCT 
GGTGCTGCTC 
TGCCATGTTC 
GCAGCGTTAC 
CGTCATCTCC 
GCCGTCCTCC 
GCTGCTAACG 



51 
I 

GGCCGCAAGC 
AGTCTGGTCA 
GGATCGGGGG 
CGGGCTGCCC 
CGGCATGTTG 
GTGGCCATGG 
AAGGGCTCGT 
CCCCTGGACT 
GGGGGCGAGA 
GTGGGAGGGA 
ATGTTGACGC. 
TCAGGGGGAG 
GGGGGAGCCA 
CTTGAGGTCC 
TATTCTTCTG 
GCCTGTGGGA 
AACCGCGTGG 
CATCTCGAAG 
CTCATTTCCC 
TACAGCGCGG 
GCAGGCACCA 
GGGCATGTGC 
GCCTGGGACC 
GCCACCTCGC 
CTTTGGGCAC 
AATGCAGTTT 
CCTGACGCTG 
GTTGACCTCT 
CGTCGCTGCC 

ATGGCCAGCA 
TACACCCGCC 
TTTGACATGG 
AAGACGCCGC 
GGGGGGCGCC 
ATGGGCAACG 
GATTTCCAGC 
CTGCTGTGCG 

TGGAACCAGT 
ACCCCGGGTT 
GTGCGGCTGC 
AGCAAGATGA 
TATGGCGTGG 
CGCCGCGTCT 
GACGTGGCCC 
CCTGGGGCCC 
CTCGTCATCT 
AGTTACACAT 
CGCCTCATCC 
CACTTGCGCC 
CCGGCCCTCG 
TGGGAATCGG 



60 
120 
180 
240 



60 
120 
180 
240 
300 
360 
420 
460 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180
TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGGAGAGC GACTCCGAGC 3240

GTCTGGAGCG CACGTCCCAG AAGGTGGACT TGGCACTGAA ACAGCTGGGA CACATCCGCG 3300 

AGTACGAACA GCGCCTGAAA GTGCTGGAGC GGGAGGTCCA GCAGTGTAGC CGCGTCCTGG 3360 

GGTGGGTGAC GTAGGCCGTT AGCAGCTCTG CCATGTTGCC CTCAGGTGGG CCGCCACCCC 3420 

TTGACCTGCA TGGGTCCAAA GAGTGAGCCA TGCTGGCGGA TTTTAAGGAG AAGCCCCCAC 3480 

AGGGGATTTT GCTCTTAGAG TAAGGCTCAT GTGGGCCTCG GCCCCCGCAC CTGGTGGCCT 3540 

TGTCCTTGAG GTGAGCCCCA TGTCCATCTG GGCCACTGTC AGGACCACCT TTGGGAGTGT 3600 

CATCCTTACA AACCACAGCA TGCCCGGCTC CTCCCAGAAC CAGTCCCAGC CTGGGAGGAT 3660 

CAAGGCCTGG ATCCCGGGCC GTTATCCATC TGGAGGCTGC AGGGTCCTTG GGGTAACAGG 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTGG GGAAATAAAG CCATTTCAGA 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 



SEQ ID NO:106 PEU5 Protein sequence
Protein Accession #: NP_060106



1          11         21         31         41         51

|          |          |          |          |          |

MASTGGTKW AMGVAFWGW RNRDTLINPK GSFPARYRWR GDPEDGVQFP LDYNYSAFFL 60 

VDDGTHGCLG GENRFRLRLE SYISQQKTGV GGTGIDIPVL LLLIDGDEKM LTRIENATQA 120 

QLPCLLVAGS GGAADCLAET LEDTLAPGSG GARQGEARDR IRRFFPKGDL EVLQAQVERI 180 

MTRKELLTVY SSEDGSEEFE TIVLKALVKA CGSSEASAYL DELRLAVAWN RVDIAQSELP 240

RGDIOWRSFH LEASLMDALL NDRPEFVRLL ISHGLSLGHF LTPMRLAQLY SAAPSNSLIR 300 

NLLDQASHSA GTKAPALKGG AAELRPPDVG HVLRMLLGKM CAPRYPSGGA WDPHPGQGFG 360 

ESMYLLSDKA TSPLSLDAGL GQAPWSDLLL WALLLNRAQM AKYFWEMGSN AVSSALGACL 420 

LLRVMARLEP OAEEAARRKD LAFKFEGMGV DLFGECYRSS EVRAARLLLR RCPLWGDATC 480 

LQLAMQADAR AFFAQDGVQS LLTQKWWGDM ASTTPIWALV LAFFCPPLIY TRLITFRKSE 540 

EEPTREELEF DMDSVINGEG PVGTADPAEK TPLGVPRQSG RPGCCGGRCG GRRCLRRWFH 600 

FWGAPVTIFM GNWSYLLFL LLFSRVLLVD FQPAPPGSLE LLLYFWAFTL LCEELRQGLS 660 

GGGGSLASGG PGPGHASLSQ RLRLYLADSW NQCDLVALTC FLLGVGCRLT PGLYHLGRTV 720 

LCIDFMVFTV RLLHIFTVNK QLGPKIVIVS KMMKDVFFFL FFLGVWLVAY GVATEGLLRP 780 

RDSDFPSILR RVFYRPYLQI FGQIPQEBMD VALMEHSNCS SEPGFWAHPP GAQAGTCVSQ 840 

YANWLWLLL, VIFLLVANIL LVNLLIAMFS YTFGKVQGNS DLYWKAQRYR LIREFHSRPA 900 

LAPPFIVISH LRLLLROLCR RPRSPQPSSP ALEHFRVYLS KEAERKLLTW ESVHKENFLL 960 
ARARDKRESD SERLERTSQK VDLALKQLGH IREYEQRLKV LEREVQQCSR VLGWVT 

SEQ ID NO:107 PEW3 DNA SEQUENCE

Nucleic Acid Accession #: NM_005982

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51

|          |          |          |          |          |

GGTAGCAGCA TCCACCGGGC GGGAGGTCGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

CGCCGGCCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCCGCC 120 

CACCGCCAAG TTCCGACTCC GGTTTTCGCC TTTGCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AGCCGCGCCC CCCTCCCTGC GGCCGCCGCC CCCTGCCTCT CGGCTCTGCT CCCTGCCGCG 240 

TGCGCCTGGG CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGGCG TGCGTGTGCG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAG AACGAGAGCG 420 

TACTCAAGGC CAAGGCGGTG GTCGCCTTCC ACCGCGGCAA CTTCCGTGAG CTCTACAAGA 480

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCACCCCAA ACTGCAGCAA CTGTGGCTGA 540 

AGGCGCATTA CGTGGAGGCC GAGAAGCTGC GCGGCCGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCCGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGGGTGTCC TGCGGGAGTG GTACGCGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGCGG GAGCTGGCCG AGGCCACCGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT TAAGAACCGG AGGCAAAGAG ACCGGGCCGC GGAGGCCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTCC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900 

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCACC TCCCCAAAGT CCAGACCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCACGCCAG GAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCACCAGCAT CAGCTCCAAG 1080 

ACTCTCTGCT CGGCCCCCTC ACCTCCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTGC AGCGACTAGG GACACTTGTA 1200 

AATAGAAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGGACTTTC ACAAATATCT TTTTAAAAAT CAAAACCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT CCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GATCAGGG 



SEQ ID NO:108 PEW3 Protein sequence
Protein Accession #: NP_005973

1          11         21         31         41         51

|          |          |          |          |          |

MSMLPSFGFT QEQVACVCEV LQQGGNLERL GRFLWSLPAC DHLHKNESVL KAKAVVAFHR 60
GNFRELYKIL ESHQFSPHNH PKLQQLWLKA HYVEAEKLRG RPLGAVGKYR VRRKFPLPRT 120
IWDGEETSYC FKEKSRGVLR EWYAHNPYPS PREKRELAEA TGLTTTQVSN WFKNRRQRDR 180
AAEAKERENT ENNNSSSNKQ NQLSPLEGGK PLMSSSEEEF SPPQSPDQNS VLLLQGNMGH 240
ARSSNYSLPG LTASQPSHGL QTHQHQLQDS LLGPLTSSLV DLGS



SEQ ID NO:109 PFJ8 DNA SEQUENCE

Nucleic Acid Accession #: NM_005069

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

GGGGCTCCGC GGGCCTGGAG CACGGCCGGG TCTAATATGC CCGGAGCCGA GCCGCGATGA 60
AGGAGAAGTC CAAGAATGCG GCCAAGACCA GGAGGGAGAA GGAAAATGGC GAGTTTTACG 120 
AGCTTGCCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180 
TCATCCGCCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCCCCGAA GGTTTAGGAG 240
ACGCGTGGGG ACAGCCGAGC CGCGCCGGGC CCCTGGACGG CGTCGCCAAG GAGCTGGGAT 300
CGCACTTGCT GCAGACTTTG GATGGATTTG TTTTTGTGGT AGCATCTGAT GGCAAAATCA 360
TGTATATATC CGAGACCGCT TCTGTCCATT TAGGCTTATC CCAGGTGGAG CTCACGGGCA 420
ACAGTATTTA TGAATACATC CATCCTTCTG ACCACGATGA GATGACCGCT GTCCTCACGG 480
CCCACCAGCC GCTGCACCAC CACCTGCTCC AAGAGTATGA GATAGAGAGG TCGTTCTTTC 540
TTCGAATGAA ATGTGTCTTG GCGAAAAGGA ACGCGGGCCT GACCTGCAGC GGATACAAGG 600
TCATCCACTG CAGTGGCTAC TTGAAGATCA GGCAGTATAT GCTGGACATG TCCCTGTACG 660
ACTCCTGCTA CCAGATTGTG GGGCTGGTGG CCGTGGGCCA GTCGCTGCCA CCCAGTGCCA 720
TCACCGAGAT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780
TGATATTCCT GGATTCCAGG GTGACCGAGG TGACGGGTTA CGAGCCGCAG GACCTGATCG 840
AGAAGACCCT ATACCATCAC GTGCACGGCT GCGACGTGTT CCACCTCCGC TACGCACACC 900 
ACCTCCTGTT GGTGAAGGGC CAGGTCACCA CCAAGTACTA CCGGCTGCTG TCCAAGCGGG 960 
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGCCGC TCGTCCCGGC 1020 
CCCACTGCAT CGTGAGTGTC AATTATGTAC TCACGGAGAT TGAATACAAG GAACTTCAGC 1080 
TGTCCCTGGA GCAGGTGTCC ACTGCCAAGT CCCAGGACTC CTGGAGGACC GCCTTGTCTA 1140
CCTCACAAGA AACTAGGAAA TTAGTGAAAC CCAAAAATAC CAAGATGAAG ACAAAGCTGA 1200 
GAACAAACCC TTACCCCCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260 
GCCAGCTCGG AAACTGGAGA GCCAGTCCCC CTGCAAGCGC TGCTGCTCCT CCAGAACTGC 1320 
AGCCCCACTC AGAAAGCAGT GACCTTCTGT ACACGCCATC CTACAGCCTG CCCTTCTCCT 1380 
ACCATTACGG ACACTTCCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440
CGGCCAAGTT CGGGCAGCCC CAAGGATCCC CTTGTGAGGT GGCACGCTTT TTCCTGAGCA 1500
CACTGCCAGC CAGCGGTGAA TGCCAGTGGC ATTATGCCAA CCCCCTAGTG CCTAGCAGCT 1560
CGTCTCCAGC TAAAAATCCT CCAGAGCCAC CGGCGAACAC TGCTAGGCAC AGCCTGGTGC 1620
CAAGCTACGA AGCGCCCGCC GCCGCCGTGC GCAGGTTCGG CGAGGACACC GCGCCCCCGA 1680
GCTTCCCGAG CTGCGGCCAC TACCGCGAGG AGCCCGCGCT GGGCCCGGCC AAAGCCGCCC 1740
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG CGCTGGCCCG CGCGGCACCC GAGTGCTGCG 1800
CGCCCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GCCCTTCGTG CTGCTCAACT 1860
ACCACCGCGT GCTGGCCCGG CGCGGACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCCCGGCGGC CCCGAGGCGG CGACCGGCGC GCTGCGGCTC CGGCACCCGA 1980 
GCCCCGCCGC CACCTCCCCG CCCGGCGCGC CCCTGCCGCA CTACCTGGGC GCCTCGGTCA 2040 
TCATCACCAA CGGGAGGIQA CCCGCTGGCC GCCCGCGCCA CGAGCCTGGA CCCGGCCTCC 2100 
CGGGGCTGCG GCGCCACCGA GCCCGGCAAA TGCGCACGAC CTACATTAAT TTATGCAGAG 2160 
ACAGCTGTTT GAATTGGACC CCGCCGCCG A CTTGCGGATT TCCACCGCGG AGGCCCCGCG 2220 
CGCCGGTGCC GAGGGCCGAG G AGCGCCCGG GTCCGGGCAG GTGACCGCCC GCCTCTGTCC 2280 
TGCGAGGGCC GGTGCGACCC AGTTGCTGGG GGCTTGGTTT CCTCACCTTG AAATCGGGCT 2340 
TCACGCGTCT TGCCTTGTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
TTCACTCCGC AAATATCCTC CACTTTCAGG AGGGAAAACC CACCCTACCA CAGTCCGCTC 2460
TTCCAAGTGG ACGGCAGACC TGGGAGGGGA CGCCTGTGTC ACGAGCCCTT TTAGATGCTT 2520 
AGGTGAAGGC AGAAGTGATG ATTGTAAGTC CCATGAATAC ACAACTCCAC TGTCTTTAAA 2580
AGTCATTCAA GAGTCTCATT ATTTTTGTTT TTATTTAACC CTTTCTTCAA TACAAAAAGC 2640
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTCCCA AATACATTAA CAAGCTCTTA CTTCCCCCTA ACCCCTATGA ACTCTTGATA 2760 
ACACCAAGAG TAGCACCTTC AGAATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820
TAGCCAGACA GTTTATGAGA ATGACCCTGT CAAGCTTCAT TATTACGTGG CAAAATCCCT 2880 
CTGGCCCACA CAGATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC ACGTGCAATA CGGAACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGCGAG 3060
AAAACTTCGT AAGAACATGT TACGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120 
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180
GATCTCAACA TGGAAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAACCGCCGT 3240
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300
ATTTTAGGAA ACTTTTTCCA CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTGA ACTGACTTTT 3420 
TTTTTTTTTT TTTTGCCAAC CCTGTGTCAC TTAGTGAGGA CCTGACACAA TCCCTACAGG 3480
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGGA CCGTGGCTCA 3540
TGCAGCGAAG GGGCTGGATG GTAGGAAGGG ATGTGCCCGC CTCTCCACCC ACTCAGCTAT 3600 
ACCTCATTCA CAGCTCCTTG TGAGTGTGTG CACAGGAAAT AAGCCGAGGG TATTATTTTT 3660 
TTATGTTCAT GAGTCTTGTA ATTAAACCGT GATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA CCGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTACCTTGTT 3780
ATTAACTTTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840 
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG



SEQ ID NO:110 PFJ8 Protein sequence:
Protein Accession #: NP_005060.1

1 11 21 31 41 51
| | | | | |

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RKAGLTCSGY 180
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VIITNGR



SEQ ID NO:111 PFJ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_006549

Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGAACGGAC GCTGCATCTG CCCGTCCCTG CCCTACTCAC CCGTCAGCTC CCCGCAGTCC 60 
TCGCCTCGCC TGCCCCGGCG GCCGACAGTG GAGTCTCACC ACGTCTCCAT CACGGGTATG 120 
CAGGACTGTG TGCAGCTGAA TCAGTATACC CTGAAGGATG AAATTGGAAA GGGCTCCTAT 180 
GGTGTCGTCA AGTTGGCCTA CAATGAAAAT GACAATACCT ACTATGCAAT GAAGGTGCTG 240 
TCCAAAAAGA AGCTGATCCG GCAGGCCGGC TTTCCACGTC GCCCTCCACC CCGAGGCACC 300 
CGGCCAGCTC CTGGAGGCTG CATCCAGCCC AGGGGCCCCA TTGAGCAGGT GTACCAGGAA 360 
ATTCCCATCC TCAAGAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTCCTGGAT 420
GACCCCAATG AGGACCATCT GTACATGGTG TTCGAACTGG TCAACCAAGG GCCCGTGATG 480
GAAGTGCCCA CCCTCAAACC ACTCTCTGAA GACCAGGCCC GTTTCTACTT CCAGGATCTG 540
ATCAAAGGCA TCGAGTACTT ACACTACCAG AAGATCATCC ACCGTGACAT CAAACCTTCC 600
AACCTCCTGG TCGGAGAAGA TGGGCACATC AAGATCGCTG ACTTTGGTGT GAGCAATGAA 660
TTCAAGGGCA GTGACGCGCT CCTCTCCAAC ACCGTGGGCA CGCCCGCCTT CATGGCACCC 720
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG CCTTGGATGT TTGGGCCATG 780
GGTGTGACAC TATACTGCTT TGTCTTTGGC CAGTGCCCAT TCATGGACGA GCGGATCATG 840
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900
GAGGACTTGA AGGACCTGAT CACCCGTATG CTGGACAAGA ACCCCGAGTC GAGGATCGTG 960
GTGCCGGAAA TCAAGCTGCA CCCCTGGGTC ACGAGGCATG GGGCGGAGCC GTTGCCGTCG 1020 
GAGGATGAGA ACTGCACGCT GGTCGAAGTG ACTGAAGAGG AGGTCGAGAA CTCAGTCAAA 1080 
CACATTCCXA GCTTGGCAAC CGTGATCCTG GTGAAGACCA TGATACGTAA ACGCTCCTTT 1140
GGGAACCCAT TCGAGGGCAG CCGGCGGGAG GAACGCTCAC TGTCAGCGCC TGGAAACTTG 1200
CTCACCAAAA AACCAACCAG GGAATGTGAG TCCCTGTCTG AGCTCAAGAC CTAGAAAATA 1260
AGTCCCCTTC CTGCCTGTTG CAAAGTAACG TAAGAGTTCC CTCACCCCAG TGGATGCAGA 1320
CGTTCTTGCT GTCAGCCACC TTCCTTCATA CACATAGCCA GCCCAGGGTG ACCAGAACGT 1380
CCCAGGACAG ATGAGGCTTT GTGTCCTTAT GAGAGTGGGA GAACCTGGTG GGCACCCCTG 1440
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500
CCTGACTTGG TGGGAGTTCC ATTCAGTCAC TTCTGTTTCT TAAACATAGC TTTACTGAGG 1560
TACAATTCAC ATACCATGTA ATTCACCCAC GGGAAGTGTA TGATTCAGTG GTTTCTAATA 1620
CACACTTCTG CAGCCATTAC CACCGTCAAC TTTACGACAT TTTCATCAGC CCAAGAAGAC 1680
ACCCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCCACC CCAGTAACCA CTCAGAATAG 1740
GTATGGATTT GCCTATTCTG GACGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800
AAAA 



SEQ ID NO:112 PFJ7 Protein sequence:
Protein Accession #: NP_006540.1

1 11 21 31 41 51
| | | | | |

MNGRCICPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEIGKGSY 60
GVVKLAYNEN DNTYYAMKVL SKKKLIRQAG FPRRPPPRGT RPAPGGCIQP RGPIEQVYQE 120
IAILKKLDHP NVVKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYFQDL 180
IKGIEYLHYQ KIIHRDIKPS NLLVGEDGHI KIADFGVSNE FKGSDALLSN TVGTPAFMAP 240
ESLSETRKIF SGKALDVWAM GVTLYCFVFG QCPFMDERIM CLHSKIKSQA LEFPDQPDIA 300
EDLKDLITRM LDKNPESRIV VPEIKLHPWV TRHGAEPLPS EDENCTLVEV TEEEVENSVK 360
HIPSLATVIL VKTMIRKRSF GNPFEGSRRE ERSLSAPGNL LTKKPTRECE SLSELKT 



SEQ ID NO:113 PFJ6 DNA SEQUENCE

Nucleic Acid Accession #: NM_021810

Coding sequence: 1-429 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGAAACCTC TGATATGGAC ATGGTCAGAT GTTGAAGGCC AGAGGCCGGC TCTGCTCATC 60
TGCACAGCTG CAGCAGGACC CACGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGCCA 120
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTTACCCAG ATGCCACAAT GCACAGACAA 180
CTCCTGGCTC CGGTGGAAGG AAGGATGGCA GAGACATTGA ATCAGAAACT CCATGTTGCC 240
AATGTGCTGG AAGATGACCC CGGCTACCTA CCTCACGTCT ACAGCGAGGA AGGGGAGTGT 300
GGAGGGGCCC CATCCCTCAG CTCTCTGGCC AGCTTGGAAC AGGAGTTGCA ACCTGATTTG 360
CTGGACTCTT TGGGTTCAAA AGCGACTCCG TTTGAGGAAA TATATTCAGA GTCAGGTGTT 420
CCTTCCTAA 



SEQ ID NO:114 PFJ6 Protein sequence:
Protein Accession #: NP_068582.1

1 11 21 31 41 51

MKPLIWTWSD VEGQRPALLI CTAAAGPTQG VKGYGKPFEP RSVKNIHSTP AYPDATMHRQ 60
LLAPVEGRMA ETLNQKLHVA NVLEDDPGYL PHVYSEEGEC GGAPSLSSLA SLEQELQPDL 120 
LDSLGSKATP FEEIYSESGV PS 




SEQ ID NO:115 PFJ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006361

Coding sequence: 131-985 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGAATGCAGG CGACTTGCGA GCTGGGAGCG ATTTAAAACG CTTTGGATTC CCCCGGCCTG 60 
GGTGGGGAGA GCGAGCTGGG TGCCCCCTAG ATTCCCCGCC CCCGCACCTC ATGAGCCGAC 120
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCTTGGAT GGAGCCAAGG ATATCGAAGG 180
CTTGCTGGGA GCGGGAGGGG GGCGGAATCT GGTCGCCCAC TCCCCTCTGA CCAGCCACCC 240
AGCGGCGCCT ACGCTGATGC CTGGTGTCAA CTATGCCCCC TTGGATCTGC CAGGCTCGGC 300
GGAGCCGCCA AAGCAATGCC ACCCATGCCC TGGGGTGCCC CAGGGGACGT CCCCAGCTCC 360
CGTGCCTTAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTCCC GGAGCTCGCT 420
GAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480
GGAAGAGTAC CCCAGTCGCC CCACTGAGTT TGCCTTCTAT CCGGGATATC CGGGAACCTA 540
CCACGCTATG GCCAGTTACC TGGACGTGTC TGTGGTGCAG ACTCTGGGTG CTCCTGGAGA 600
ACCGCGACAT GACTCCCTGT TGCCTGTGGA CAGTTACCAG TCTTGGGCTC TCGCTGGTGG 660
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAGAACCCA CCAGGTCCCT TTTGGAAGGC 720
AGCATTTGCA GACTCCAGCG GGCAGCACCC TCCTGACGCC TGCGCCTTTC GTCGCGGCCG 780
CAAGAAACGC ATTCCGTACA GCAAGGGGCA GTTGCGGGAG CTGGAGCGGG AGTATGCGGC 840
TAACAAGTTC ATCACCAAGG ACAAGAGGCG CAAGATCTCG GCAGCCACCA GCCTCTCGGA 900
GCGCCAGATT ACCATCTGGT TTCAGAACCG CCGGGTCAAA GAGAAGAAGG TTCTCGCCAA 960
GGTGAAGAAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTG 1020
GGGGTGTCCT GGGGAGACCA GAAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080
AGAGGCCCCT AGAGACAACA CCCTTCCCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG 1140
CGGCCTGGGT ACCCAGTATG TGCAGGGAGA CGGAACCCCA TGTGACAGGC CCACTCCACC 1200 
AGGGTTCCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 


SEQ ID NO:116 PFJ5 Protein sequence:
Protein Accession #: NP_006352.1

1 11 21 31 41 51

MEPGNYATLD GAKDIEGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60
KQCHPCPGVP QGTSPAPVPY GYFGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120
PSRPTEFAFY PGYPGTYHAM ASYLDVSVVQ TLGAPGEPRH DSLLPVDSYQ SWALAGGWNS 180
QMCCQGEQNP PGPFWKAAFA DSSGQHPPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240
ITKDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN SATP



SEQ ID NO:117 PFJ4 DNA SEQUENCE
Nucleic Acid Accession #: NM_005628

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60
CCTCCCGGAC CTAAGAGCCT GGGTCCCCTG TTTCCGGAGG TCCGCTTCCC GGCCCCCAGA 120
TTCTGGCATC CCAGCCCTCA GTGTCCAAGA CCCAGGCAGC CCGGGTCCCC GCCTCCCGGA 180
T(X AGGCGTC CGGGATCTGC GCCACCAGAA CCTAGCCTCC TGCAGACCTC CGCCATCTGG 240
GGGCACTCAA CCTCCTGGAG CCAAGGGCCC CACGTCCCAC CCAGAGAAAC TCTCGTATTC 300
CCAGCTCCTA GGGCCAAGGA ACCCGGGCGC TCCGAACTCC CAGCTTTCGG ACATCTGGCA 360
CACGGGGCAG AGCAGAGAAG CTCAGCGCCC AGCCTGGGGA ATTTAAACAC TCCAGCTTCC 420
AAGAGCCAAG GAACTTCAGT GCTGTGAACT CACAACTCTA AGGAGCCCTC CAAAGTTCCA 480
GTCTCCAGGT GCTGTTACTC AACTCAGTCC TAGGAACGTC GGGTCCTGGG AAGGAGCCCA 540
AGCGCTCCCA GCCAGCTTCC AGGCGCTAAG AAACCCCGGT GCTTCCCATC ATGGTGGCCG 600
ATCCTCCTCG AGACTCCAAG GGGCTCGGAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660
CGCTGGCCTC CATCGAGGAC CAAGGCCCGG CAGCAGGCGG CTACTGCGGT TCCCGGGACC 720
AGGTGCGCCG CTGCCTTCGA GCCAACCTGC TTGTGCTGCT GACAGTGGTG GCCGTGGTGG 780
CCGGCGTGGC GCTGGGACTG GGGGTGTCGG GGGCCGGGGG TGCGCTGGCG TTGGGCCCGG 840
AGCGCTTGAG CCCCTTCGTC TTCCCGGGCG AGCTGCTGCT GCGTCTGCTG CGGATGATCA 900
TCTTGCCGCT GGTGGTGTGC AGCTTGATCG GCGGCGCCGC CAGCCTCGAC CCCGCCCCCC 960 
TCGCCCGTCT GGGCGCCTGG GCGCTGCTCT TTTTCCTGGT CACCACGCTG CTGGCGTCGG 1020 
CGCTCGGAGT GGGCTTGGCG CTGGCTCTGC AGCCGGGCGC CGCCTCCGCC GCCATCAACG 1080 
CCTCCGTGGG AGCCGCGGGC AGTGCCGAAA ATGCCCCCAG CAAGGAGGTG CTCGATTCGT 1140
TCCTGGATCT TGCGAGAAAT ATCTTCCCTT CCAACCTGGT GTCAGCAGCC TTTCGCTCAT 1200 
ACTCTACCAC CTATGAAGAG AGGAATATCA CCGGAACCAG GGTGAAGGTG CCCGTGGGGC 1260 
AGGAGGTGGA GGGGATGAAC ATCCTGGGCT TGGTAGTGTT TGCCATCGTC TTTGGTGTGG 1320
CGCTGCGGAA GCTGGGGCCT GAAGGGGAGC TGCTTATCCG CTTCTTCAAC TCCTTCAATG 1380 
AGGCCACCAT GGTTCTGGTC TCCTGGATCA TGTGGTACGC CCCTGTGGGC ATCATGTTCC 1440 
TGGTGGCTGG CAAGATCGTG GAGATGGAGG ATGTGGGTTT ACTCTTTGCC CGCCTTGGCA 1500 
AGTACATTCT GTGCTGCCTG CTGGGTCACG CCATCCATGG GCTCCTGGTA CTGCCCCTCA 1560 
TCTACTTCCT CTTCACCCGC AAAAACCCCT ACCGCTTCCT GTGGGGCATC GTGACGCCGC 1620
TGGCCACTGC CTTTGGGACC TCTTCCAGTT CCGCCACGCT GCCGCTGATG ATGAAGTGCG 1680
TGGAGGAGAA TAATGGCGTG GCCAAGCACA TCAGCCGTTT CATCCTGCCC ATCGGCGCCA 1740 
CCGTCAACAT GGACGGTGCC GCGCTCTTCC AGTGCGTGGC CGCAGTGTTC ATTGCACAGC 1800 
TCAGCCAGCA GTCCTTGGAC TTCGTAAAGA TCATCACCAT CCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGG GGCAGCGGGC ATCCCTGCTG GAGGTGTCCT CACTCTGGCC ATCATCCTCG 1920 
AAGCAGTCAA CCTCCCGGTC GACCATATCT CCTTGATCCT GGCTGTGGAC TGGCTAGTCG 1980
ACCGGTCCTG TACCGTCCTC AATGTAGAAG GTGACGCTCT GGGGGCAGGA CTCCTCCAAA 2040 
ATTATGTGGA CCGTACGGAG TCGAGAAGCA CAGAGCCTGA GTTGATACAA GTGAAGAGTG 2100 
AGCTGCCCCT GGATCCGCTG CCAGTCCCCA CTGAGGAAGG AAACCCCCTC CTCAAACACT 2160 
ATCGGGGGCC CGCAGGGGAT GCCACGGTCG CCTCTGAGAA GGAATCAGTC ATGTAAACCC 2220
CGGGAGGGAC CTTCCCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATGAGGAATG 2280
GATAAATGGA TGAGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340
CCAGCACCCT CCAGGACAGG AGATCTGGGA TGCCTGGCTG CTGGAGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACCC CCAGTTCTCA CTCATGTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCTGCTGCGT CCCCACCGTG ACCTGCCTGG CCTCCCCTGT 2520
CTCAGGGAGC AGGTCACAGG TCACCATGGG GAATTCTAGC CCCCACTGGG GGGATGTTAC 2580
AACACCATGC TGGTTATTTT GGCGGCTGTA GTTGTGGGGG GATGTGTGTG TGCACGTGTG 2640
TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TTCTGTGACC TCCTGTCCCC ATGGTACGTC 2700
CCACCCTGTC CCCAGATCCC CTATTCCCTC CACAATAACA GAAACACTCC CAGGGACTCT 2760
GGGGAGAGGC TGAGGACAAA TACCTGCTGT CACTCCAGAG GACATTTTTT TTAGCAATAA 2820
AATTGAGTGT CAACTATTTA AAAAAAAAAA AAAAAA 



SEQ ID NO:118 PFJ4 Protein sequence:
Protein Accession #: NP_005619.1

1 11 21 31 41 51
| | | | | |

MVADPPRDSK GLAAAEPTAN GGLALASIED QGAAAGGYCG SRDQVRRCLR ANLLVLLTVV 60
AVVAGVALGL GVSGAGGALA LGPERLSAFV FPGELLLRLL RMIILPLVVC SLIGGAASLD 120
PGALGRLGAW ALLFFLVTTL LASALGVGLA LALQPGAASA AINASVGAAG SAENAPSKEV 180
LDSFLDLARN IFPSNLVSAA FRSYSTTYEE RNITGTRVKV PVGQEVEGMN ILGLVVFAIV 240
FGVALRKLGP EGELLIRFFN SFNEATMVLV SWIMWYAPVG IMFLVAGKIV EMEDVGLLFA 300
RLGKYILCCL LGHAIHGLLV LPLIYFLFTR KNPYRFLWGI VTPLATAFGT SSSSATLPLM 360
MKCVEENNGV AKHISRFILP IGATVNMDGA ALFQCVAAVF IAQLSQQSLD FVKIITILVT 420
ATASSVGAAG IPAGGVLTLA IILEAVNLPV DHISLILAVD WLVDRSCTVL NVEGDALGAG 480
LLQNYVDRTE SRSTEPELIQ VKSELPLDPL PVPTEEGNPL LKHYRGPAGD ATVASEKESV 540
M 



SEQ ID N0:119 PFJ3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006708

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CTAGTTAAGG CGGCACAGGG CCGAGGCGTA GTGTGGGTGA CTCCTCCGTT CCTTGGGTCC 60 
CGTCGTCTGT GATACTGCAG TTCAGCCATG GCAGAACCGC AGCCCCCGTC CGGCGGCCTC 120
ACGGACGAGG CCGCCCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180 
TTGCAGCAGA CCATGCTACG AGTGAAGGAT CCTAAGAAGT CACTGGATTT TTATACTAGA 240 
GTTCTTGGAA TGACGCTAAT CCAAAAATGT GATTTTCCCA TTATGAAGTT TTCACTCTAC 300 
TTCTTGGCTT ATGAGGATAA AAATGACATC CCTAAAGAAA AAGATGAAAA AATAGCCTGG 360
GCGCTCTCCA GAAAAGCTAC ACTTGAGCTG ACACACAATT GGGGCACTGA AGATGATGCG 420
ACCCAGAGTT ACCACAATGG CAATTCAGAC CCTCGAGGAT TCGGTCATAT TGGAATTGCT 480
GTTCCTGATG TATACAGTGC TTGTAAAAGG TTTGAAGAAC TGGGAGTCAA ATTTGTGAAG 540 
AAACCTGATG ATGGTAAAAT GAAAGGCCTG GCATTTATTC AAGATCCTGA TGGCTACTGG 600 
ATTGAAATTT TGAATCCTAA CAAAATGGCA ACCTTAATGT AGTGCTGTGA GAATTCTCCT 660
TTGAGATTTC AGAAGAAAGG AAACAATGTG ATTCAAGATA TTTACATACC AGAAGCATCT 720
AGGACTGATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTTCC CCTTCCTATT 780
TCAGCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCAAGCAG TGCTTTATCT 840
CATGTCCTTG AATATAGTTG TGTAACTTTA TTTTTTAGGT AATAATTAGA ACAGTTCCCT 900 
TCAGAGGCTG CATTTGCCTT CTTCTGCCAC CTAAATATTA CTTCCCTTCA AATCTGCCTT 960 
TGAATCATCA TTTTTAAAAA AAAATTAACA TGTTTTTGTT GTAGTTATCT TCTGGGGTTT 1020 
CAATTCCTCA GAAACAACTT TTTTCACAAC GGAAAGGAAA GAACACTAGT GTTCTTTCAG 1080
TAAAGTACAA AGTGTTTATT TTACAAAAGA GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1140
GCTGACAAGG ATACTGATAG AAAAAGTGAT TTCTTCTTAT TATAAAGTAC ATTTAAAGTT 1200
CAAGGACTAA CCTTATTTAT TTGGGAAAGG GGAGGAGGAA GGAAATGATA TGGTACCCAG 1260
ACACTGGGCT AGGCTGCAAC TTTATCTCAT TTAATACTCC CAGCTGTCAT GTGAGAAAGA 1320 
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGGATTATTA ATGGATTTAA GAGGGCATCA 1380 
ATCAGCTCAA CTCAAGATTT CATAATCATT TTTAGTATTT AGATTGTGCC TCAAAGTTGT 1440
AGTACCTCAC AATACCTCCA CTGGTTTCCT GTTGTAAAAA CCTTCAGTGA GTTTGACCAT 1500 
TGTGCTCTTG GCTCTTGGGC TGGAGTACCG TGGTGAGGGA GTAAACACTA GAAGTCTTTA 1560 
GTACAAAACT GCTCTAGGGA CACCTGGTGA TTCCTACACA AGTGATGTTT ATATTTCTCA 1620 
TAAAGAGTCT TCCCTATCCC AAGGTCTTCA TGATGCCAGT AGCCATATAT GATAAATTAT 1680 

GTTCAGTGAT AACTTAGTTA TCAGAAATCA GCTCAGTGGT CTTCCCCGCC ATGATTCACA 1740
TTTGATGAGT TTTTAAAAAT CAAAGTGATT TTGAAAATCT CTAATGGCTC AGAAAATAAA 1800
AACATCCAGT TTGTGGATGA CTATATTTAG ATTTCTCTAG ACTCTAGTGG AAGACCTTTG 1860
GAAAGGCCAT GCCAACCGTG CTTGTACTGC TAGAAGCACT TTATGTTTCC TTTTTGGGTG 1920
AAATGGATTT ATGTGAGTGC TTTAAACAAA TAGCAATACT TATAGACTGA AATAAAATGA 1980
AACTTCAAAT AAG






SEQ ID NO:120 PFJ3 Protein sequence: 
Protein Accession #: NP_006699.1

1 11 21 31 41 51
| | | | | |

MAEPQPPSGG LTDEAALSCC SDADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTLIQK 60
CDFPIMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQSYHNGNS 120
DPRGFGHIGI AVPDVYSACK RFEELGVKFV KKPDDGKMKG LAFIQDPDGY WIEILNPNKM 180
ATLM 



SEQ ID NO:121 PFJ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_002867

Coding sequence: 70-729 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CCGACGCCAG GTCCTGCCGT CCCGCCGACC GTCCGGGAGC GAACCCGTCG TCCCGCACTG 60 
GAGTCCGCGA TGGCTTCAGT GACAGATGGT AAACATGGAG TCAAAGATGC CTCTGACCAG 120
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTGG CAAGACCTCC 180
TTCCTCTTGC GCTATGCTGA TGACACGTTC ACCCCAGCCT TCGTTAGCAC CGTGGGCATC 240
GACTTCAAGG TGAAGACAGT CTACCGTCAC GAGAAGCGGG TGAAACTGCA GATCTGCGAC 300
ACAGCTGGGC AGCAGCGGTA CCGGACCATC ACAACAGCCT ATTACCGTGG GGCCATGGGC 360
TTCATTCTGA TGTATGACAT CACCAATGAA GAGTCCTTCA ATGCTGTCCA AGACTGGGCT 420
ACTCAGATCA AGACCTACTC CTGGGACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480
GACATGGAGG AAGAGAGGGT TGTTCCCACT GAGAAGGGCC AGCTCCTTGC AGAGCAGCTT 540
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600
CGCCTGGTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAGA CCCGTCGATG 660
CTGGGCTCCT CCAAGAACAC GCGTCTCTCG GACACCCCAC CGCTGCTGCA GCAGAACTGC 720
TGATGCTAGC AAGGCCCACC TTCCTGACCT CCCCTCATTG TGGCCCCACA CCCAAGTCTG 780
CTTCTCCCTG TTACACACTG TCCGCTCT 



SEQ ID NO:122 PFJ2 Protein sequence:
Protein Accession #: NP_002858.1



1 11 21 31 41 51
| | | | | |

MASVTDGKHG VKDASDQNFD YMFKLLIIGN SSVGKTSFLL RYADDTFTPA FVSTVGIDFK 60
VKTVYRHEKR VKLQIWDTAG QERYRTITTA YYRGAMGFIL MYDITNEESF NAVQDWATQI 120
KTYSWDNAQV ILVGNKCDME EERVVPTEKG QLLAEQLGFD FFEASAKENI SVRQAFERLV 180
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC



SEQ ID NO:123 PFJ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001844

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ACGCAGAGCG CTGCTGGGCT GCCGGGTCTC CCGCTTCCTC CTCCTGCTCC AAGGGCCTCC 60
TGCATGAGGG CGCGGTAGAG ACCCGGACCC GCGCCGTGCT CCTGCCGTTT CGCTGCGCTC 120
CGCCCGGGCC CGGCTCAGCC AGGCCCCGCG GTGAGCCATG ATTCGCCTCG GGGCTCCCCA 180
GTCGCTGGTG CTGCTGACGC TGCTCGTCGC CGCTGTCCTT CGGTGTCAGG GCCAGGATGT 240
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAGAGGTAT AATGATAAGG ATGTGTGGAA 300
GCCGGAGCCC TGCCGGATCT GTGTCTGTGA CACTGGGACT GTCCTCTGCG ACGACATAAT 360
CTGTGAAGAC GTGAAAGACT GCCTCAGCCC TGAGATCCCC TTCGGAGAGT GCTGCCCCAT 420
CTGCCCAACT GACCTCGCCA CTGCCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480
ACCTGGAGAC ATCAAGGATA TTGTAGGACC CAAAGGACCT CCTGGGCCTC AGGGACCTGC 540
AGGGGAACAA GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG CCCCTGCACC 600
TCGTGGCAGA GATGGAGAAC CTGGGACCCC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660
CCCCCCTGGT CCCCCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA CCAATGGGCC CCATGGGACC 780
TCGAGGACCT CCAGGCCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840
TGAACCTGGT GAACCTGGTG TCTCTGGTCC CATGGGTCCC CGTGGTCCTC CTGGTCCCCC 900
TGGAAAGCCT GGTGATGATG GTGAAGCTGG AAAACCTGGA AAAGCTGGTG AAAGGGGTCC 960
GCCTGGTCCT CAGGGTGCTC GTGGTTTCCC AGGAACCCCA GGCCTTCCTG GTGTCAAAGG 1020
TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080
GGGTGAGAGT GGTTCCCCGG GTGAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGCCT 1140
GCCTGGTGAA AGAGGACGGA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200 
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCCC 1260 
TGGTGCTCCT GGAGCCAAGG GTGAAGCCGG CCCCACTGGT GCCCGTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTCC TGGGTCCCCT GGGGCTGCTG GTGCCTCCGG 1380
TAACCCTGGA ACAGATGGAA TTCCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440
TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAACAAGG 1560 
CCCCAAGGGA GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCG CTGGTGAAGA 1620 
AGGCAAGAGA GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGGAGA 1680 
AAGAGGTGCT CCCGGAAACC GCGGTTTCCC AGGTCAAGAT GGTCTGGCAG GTCCCAAGGG 1740
AGCCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800 
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGCCCGGGGT CTCACTGGCC GCCCTGGTGA 1860 
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGGAGCCCCT GGTGAAGATG GTCGTCCTGG 1920 
ACCTCCAGGT CCTCAGGGGG CTCGTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCCCCAA 1980 
AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT CAACCAGGCG AGCAGGGTGC TCCTGGGCCA TCTGGGTTCC AGGGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAG GTGAAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGGAGCC CCTGGCCTCG TGGGTCCCAG GGGTGAACGA GGTTTCCCAG GTGAACGTGG 2280 
CTCTCCCGGT GCCCAGGGCC TCCAGGGTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340
TGGTCCCAAA GGTGCATCTG GCCCAGCAGG CCCCCCTGGC GCACAGGGCC CTCCAGGTCT 2400 
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580
TCCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG GAGAGACTGG 2640
CCCCCCCGGA CCAGCGGGAT TTGCTGGGCC TCCTGGTGCT GATGGCCAGC CTGGGGCCAA 2700
GGGTGAGCAA GGAGAGGCCG GCCAGAAAGG CGATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGGAGTGACT GGTCCTAAAG GAGCCCGAGG 2820
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGGACCCCC 2880
AGGCTCCAAT GGCAACCCTG GACCCCCTGG TCCCCCTGGT CCTTCTGGAA AAGATGGTCC 2940
CAAAGGTGCT CGAGGAGACA GCGGCCCCCC TGGCCGAGCT GGTGAACCCG GCCTCCAAGG 3000
TCCTGCTGGA CCCCCTGGCG AGAAGGGAGA GCCTGGAGAT GACGGTCCCT CTGGTGCCGA 3060 
AGGTCCACCA GGTCCCCAGG GTCTGGCTGG TCAGAGAGGC ATCGTCGGTC TGCCTGGGCA 3120 
ACGTGGTGAG AGAGGATTCC CTGGCTTGCC TGGCCCATCG GGTGAGCCCG GCAAGCAGGG 3180 
TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTGAC 3240
GGGTCCTGCA GGTGAACCCG GACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420
AGGAGAAGCT GGTGCACAAG GCCCCATGGG ACCCTCAGGA CCAGCTGGAG CCCGGGGAAT 3480
CCAGGGTCCT CAAGGCCCCA GAGGTGACAA AGGAGAGGCT GGAGAGCCTG GCGAGAGAGG 3540
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTGGTCCTTC 3600
TGGAGACCAA GGTGCTTCTG GTCCTGCTGG TCCTTCTGGC CCTAGAGGTC CTCCTGGCCC 3660
CGTCGGTCCC TCTGGCAAAG ATGGTGCTAA TGGAATCCCT GGCCCCATTG GGCCTCCTGG 3720
TCCCCGTGGA CGATCAGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780
TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840
GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900 
GAGACAGCAT GACGCCGAGG TGGATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960 
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020 
CTGCCACCCT GAGTGGAAGA GTGGAGACTA CTGGATTGAC CCCAACCAAG GCTGCACCTT 4080 
GGACGCCATG AAGGTTTTCT GCAACATGGA GACTGGCGAG ACTTGCGTCT ACCCCAATCC 4140
AGCAAACGTT CCCAAGAAGA ACTGGTGGAG CAGCAAGAGC AAGGAGAAGA AACACATCTG 4200
GTTTGGAGAA ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320 
CATCACCTAC CACTGCAAGA ACAGCATTGC CTATCTGGAC GAAGCAGCTG GCAACCTCAA 4380 
GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAGATC CGGGCAGAGG GCAATAGCAG 4440
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGAAACAT ACCGGTAAGT GGGGCAAGAC 4500 
TGTTATCGAG TACCGGTCAC AGAAGACCTC ACGCCTCCCC ATCATTGACA TTGCACCCAT 4560 
GGACATAGGA GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCTTCTTGTA 4620
AAAACCTGAA CCCAGAAACA ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680 
AATCTCAGTC ACTCTAGGAC TCTGCACTGA ATGGCTGACC TGACCTGATG TCCATTCATC 4740
CCACCCTCTC ACAGTTCGGA CTTTTCTCCC CTCTCTTTCT AAGAGACCTG AACTGGGCAG 4800
ACTGCAAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860
CAAGGCAGAG GCAGGAAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920 
GCCCAGGCCA GAAGACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980 
ATGGTGCTAT TCTGTGTCAA ACACCTCTGT ATTTTTTAAA ACATCAATTG ATATTAAAAA 5040
TGAAAAGATT ATTGGAAAGT 



SEQ ID NO:124 PFJ1 Protein sequence:
Protein Accession #: NP_001835.2

1 11 21 31 41 51
| | | | | |

MIRLGAPQSL VLLTLLVAAV LRCQGQDVQE AGSCVQDGQR YNDKDVWKPE PCRICVCDTG 60
TVLCDDIICE DVKDCLSPEI PFGECCPICP TDLATASGQP GPKGQKGEPG DIKDIVGPKG 120
PPGPQGPAGE QGPRGDRGDK GEKGAPGPRG RDGEPGTPGN PGPPGPPGPP GPPGLGGNFA 180
AQMAGGFDEK AGGAQLGVMQ GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG 240
PRGPPGPPGK PGDDGEAGKP GKAGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 300
EAGAPGVKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 360
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAKG 420
SAGAPGIAGA PGFPGPRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480
APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 540 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600 
VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPGPAGPA GERGEQGAPG 660 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720 
LPGTPGTDGP KGASGPAGPP GAQGPPGLQG MPGERGAAGI AGPKGDRGDV GEKGPEGAPG 780
KDGGRGLTGP IGPPGPAGAN GEKGEVGPPG PAGSAGARGA PGERGETGPP GPAGFAGPPG 840 
ADGQPGAKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 900 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GPVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 1020 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAGVKGDRGE TGAVGAPGAP GPPGSPGPAG 1080
PTGKQGDRGE AGAQGPMGPS GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG 1140
LPGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200
PPGNPGPPGP PGPPGPGIDM SAFAGLGPRE KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQIESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320 
ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTALKDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PIIDIAPMDI GGPEQEFGVD IGPVCFL



SEQ ID NO:125 PFH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005084

Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCTGGTCGGA GGCTCGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGGAG CGCTTGGGTC 60
GCGTTGGTGC GCGGTGGAAC GCGCCCAGGG ACCCCAGTTC CCGCGAGCAG CTCCGCGCCG 120
CGCCTGAGAG ACTAAGCTGA AACTGCTGCT CAGCTCCCAA GATGGTGCCA CCCAAATTGC 180
ATGTGCTTTT CTGCCTCTGC GGCTGCCTGG CTGTGGTTTA TCCTTTTGAC TGGCAATACA 240 
TAAATCCTGT TGCCCATATG AAATCATCAG CATGGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGCCAA ACTAAAATCC CCCGGGGAAA TGGGCCTTAT TCCGTTGGTT 360 
GTACAGACTT AATGTTTGAT CACACTAATA AGGGCACCTT CTTGCGTTTA TATTATCCAT 420 
CCCAAGATAA TGATCGCCTT GACACCCTTT GGATCCCAAA TAAAGAATAT TTTTGGGGTC 480 
TTAGCAAATT TCTTGGAACA CACTGGCTTA TGGGCAACAT TTTGAGGTTA CTCTTTGGTT 540
CAATGACAAC TCCTGCAAAC TGGAATTCCC CTCTGAGGCC TGGTGAAAAA TATCCACTTG 600
TTGTTTTTTC TCATGGTCTT GGGGCATTCA GGACACTTTA TTCTGCTATT GGCATTGACC 660
TGGCATCTCA TGGGTTTATA GTTGCTGCTG TAGAACACAG AGATAGATCT GCATCTGCAA 720
CTTACTATTT CAAGGACCAA TCTGCTGCAG AAATAGGGGA CAAGTCTTGG CTCTACCTTA 780
GAACCCTGAA ACAAGAGGAG GAGACACATA TACGAAATGA GCAGGTACGG CAAAGAGCAA 840
AAGAATGTTC CCAAGCTCTC AGTCTGATTC TTGACATTGA TCATGGAAAG CCAGTGAAGA 900 
ATGCATTAGA TTTAAAGTTT GATATGGAAC AACTGAAGGA CTCTATTGAT AGGGAAAAAA 960 
TAGCAGTAAT TGGACATTCT TTTGGTGGAG CAACGGTTAT TCAGACTCTT AGTGAAGATC 1020
AGAGATTCAG ATGTGGTATT GCCCTGGATG CATGGATGTT TCCACTGGGT GATGAAGTAT 1080
ATTCCAGAAT TCCTCAGCCC CTCTTTTTTA TCAACTCTGA ATATTTCCAA TATCCTGCTA 1140
ATATCATAAA AATGAAAAAA TGCTACTCAC CTGATAAAGA AAGAAAGATG ATTACAATCA 1200
GGGGTTCAGT CCACCAGAAT TTTGCTGACT TCACTTTTGC AACTGGCAAA ATAATTGGAC 1260
ACATGCTCAA ATTAAAGGGA GACATAGATT CAAATGTAGC TATTGATCTT AGCAACAAAG 1320
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTGATTGA AGGAGATGAT GAGAATCTTA TTCCAGGGAC CAACATTAAC ACAACCAATC 1440 
AACACATCAT GTTACAGAAC TCTTCAGGAA TAGAGAAATA CAATTAGGAT TAAAATAGGT 1500 
in n 



SEQ ID NO:126 PFH9 Protein sequence:
Protein Accession #: NP_005075.1

1 11 21 31 41 51

MVPPKLHVLF CLCGCLAVVY PFDWQYINPV AHMKSSAWVN KIQVLMAAAS FGQTKIPRGN 60
GPYSVGCTDL MFDHTNKGTF LRLYYPSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGNI 120
LRLLFGSMTT PANWNSPLRP GEKYPLVVFS HGLGAFRTLY SAIGIDLASH GFIVAAVEHR 180
DRSASATYYF KDQSAAEIGD KSWLYLRTLK QEEETHIRNE QVRQRAKECS QALSLILDID 240
HGKPVKNALD LKFDMEQLKD SIDREKIAVI GHSFGGATVI QTLSEDQRFR CGIALDAWMF 300
PLGDEVYSRI PQPLFFINSE YFQYPANIIK MKKCYSPDKE RKMITIRGSV HQNFADFTFA 360
TGKIIGHMLK LKGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCLIE GDDENLIPGT 420
NINTTNQHIM LQNSSGIEKY N



SEQ ID NO:127 PFH8 DNA SEQUENCE

Nucleic Acid Accession #: NM_015900

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CACGAGCGGC ACGAGGATTT CCAGCTCAGC GATGCCCCCA GGTCCCTGGG AGAGCTGCTT 60 
CTGGGTGGGG GGCCTCATTT TGTGGCTCAG CGTTGGAAGT TCAGGGGATG CACCTCCTAC 120 
CCCACAGCCA AAGTGCGCTG ACTTCCAGAG CGCCAACCTT TTTGAAGGCA CCGATCTCAA 180 
AGTCCAGTTT CTCCTCTTTG TCCCTTCGAA TCCTAGCTGT GGGCAGCTAG TAGAAGGAAG 240 
CAGTGACCTC CAAAACTCTG GGTTCAATGC CACTCTGGGA ACCAAACTAA TTATCCATGG 300 
ATTCAGGGTT TTAGGAACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CCCTTCTGCG 360
TGCAACGAAT GCTAATGTGA TTGCOGTGGA CTGGATTTAT GGGTCTACAG GAGTCTACTT 420
CTCAGCTGTG AAAAATGTGA TTAAGTTGAG CCTCGAGATC TCCCTTTTCC TCAATAAACT 480
CCTGGTGCTG GGTGTGTCGG AATCCTCAAT CCACATCATT GGTGTTAGCC TGGGGGCCCA 540
CGTTGGGGGC ATGGTGGGAC AGCTCTTCGG AGGCCAGCTG GGACAGATCA CAGGCCTGGA 600
CCCCGCTGGA CCTGAGTACA CCAGGGCCAG TGTGGAAGAG CGCTTGGATG CTGGAGATGC 660 
CCTCTTCGTG GAAGCCATCC ACACAGACAC CGACAATTTG GGTATTCGGA TTCCCGTTGG 720 
ACATGTGGAC TACTTCGTCA ACGGAGGCCA AGACCAACCT GGCTGCCCCA CCTTCTTTTA 780 
CGCAGGTTAT AGTTATCTGA TCTGTGATCA CATGAGGGCT GTGCACCTCT ACATCAGCGC 840 
CCTGGAGAAT TCCTGTCCAC TGATGGCCTT TCCCTGTGCC AGCTACAAGG CCTTCCTTGC 900 
TGGACGCTGT CTGGATTGCT TTAACCCTTT TCTGCTTTCC TGCCCAAGGA TAGGACTGGT 960
GGAACAAGGT GGTGTCAAGA TAGAGCCGCT CCCCAAGGAA GTGAAAGTCT ACCTCCTGAC 1020
TACTTCCAGT GCTCCGTACT GCATGCATCA CAGCCTCGTG GAGTTTCACT TGAAGGAACT 1080
GAGAAACAAG GACACCAACA TCGAGGTTAC CTTCCTTAGC AGTAACATCA CCTCTTCATC 1140
TAAGATCACC ATACCTAAGC AGCAACGCTA TGGGAAAGGA ATCATAGCCC ATGCCACCCC 1200 
ACAATGCCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260 
AGACCGGACT ACCATTATTG GGAAGTTCTG CACTGCCCTT TTGCCTGTCA ATGACAGAGA 1320 
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380 
CCTGAAGATA GCCTGTGTGT AGTTTAACCT GGGCAGGACA CATCTCCCTG CATTTTTTTT 1440
TTTTTTTTTT GAGAGAGAGG TGTGATGAGG GATGTGTGTG TGCAGCTTAT TGTAGACCAT 1500
TACTACTAAG GAGAAAAGCA AAGCTCTTTC TTATTTTCCT CATAATCAGC TACCCTGGAG 1560
GGGAGGGAGA ACTCATTTTA CAGAACTTGG TTTCCTTTGC CGATCTTATG TACATACCCA 1620
TTTTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CTCCTTGGGC ATTCGTACTT 1680
AGGATTCAAT AGAAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA 



SEQ ID NO:128 PFH8 Protein sequence: 
Protein Accession #: NP_056984.1 

1 11 21 31 41 51 
| | | | | |

MPPGPWESCF WVGGLILWLS VGSSGDAPPT PQPKCADFQS ANLFEGTDLK VQFLLFVPSN 60
PSCGQLVEGS SDLQNSGFNA TLGTKLIIHG FRVLGTKPSW IDTFIRTLLR ATNANVIAVD 120
WIYGSTGVYF SAVKNVIKLS LEISLFLNKL LVLGVSESSI HIIGVSLGAH VGGMVGQLFG 180
GQLGQITGLD PAGPEYTRAS VEERLDAGDA LFVEAIHTDT DNLGIRIPVG HVDYFVNGGQ 240
DQPGCPTFFY AGYSYLICDH MRAVHLYISA LENSCPLMAF PCASYKAFLA GRCLDCFNPF 300
LLSCPRIGLV EQGGVKIEPL PKEVKVYLLT TSSAPYCMHH SLVEFHLKEL RNKDTNIEVT 360
FLSSNITSSS KITIPKQQRY GKGIIAHATP QCQINQVKFK FQSSNRVWKK DRTTIIGKFC 420
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LKIACV



SEQ ID NO:129 PFH7 DNA SEQUENCE

Nucleic Acid Accession #: NM_014384

Coding sequence: 89-1336 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGTTGCCGGG TCGCAGGTCC CGCCAGTGCG AGCGCAACGG AGGTCGAAGG CGTTCAGACT 60 
CTTAGCTG AA CGCGG AGCTG CGGCGGC TAT GC TGTGG AGC GGCTGCCGGC GTTTCGGGGC 1 20 
GCGCCTCGGC TGCCTGCCCG GCGGTCTCCG GGTCCTCGTC CAGACCGGCC ACCGGAGCTT 180 
GACCTCCTGC ATCGACCCTT CCATGGGACT TAATGAAGAG CAG AAAGAAT TTCAAAAAGT 240 
GGCCTTTGAC TTTGCTGCCC GAGAGATGGC TCCAAATATG GCAGAGTGGG ACCAGAAGGA 300 
GCTGTTCCCA GTGGATGTGA TGCGGAAGGC AGCCCAGCTA GGCTTCGGAG GGGTCTACAT 360 
ACAAACAGAT GTGGGCGGGT CTGGGCTGTC ACGTCTTGAT ACCTCTGTCA TTTTTG AAGC 420 
CTTGGCTACA GGCTGCACCA GCACCACAGC CTATATAAGC ATCCACAACA TGTGTGCCTG 480 
GATGATTGAT AGCTTCGGAA ATG AGGAACA GAGGCACAAA TTTTGCCCAC CGCTCTGTAC 540 
CATGGAGAAG TTTGCTTCCT ACTGCCTCAC TGAACCAGGA AGTGGG AGTG ATGCTGCCTC 600 
TCTTCTGACC TCCGCTAAGA AACAGGGAGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660 
CATCAGTGGT GCTGGTGAGT CAGACATCTA TGTGGTCATG TGCCGAACAG GAGGACCAGG 720 
CCCCAAGGGC ATCTCATGCA TAGTTGTTGA GAAGGGGACC CCTGGCCTCA GCTTTGGCAA 780 
GAAGGAG AAA AAGGTGGGGT GG AACTCCCA GCCAACACG A GCTGTG ATCT TCG AAGACTG 840 
TGCTGTCCCT GTGGCCAACA G AATTGGGAG CGAGGGGCAG GGCTTCCTCA TTGCCGTGAG 900 
AGGACTGAAC GGAGGGAGGA TCAATATTGC TTCCTCCTCC CTGGGGGCTG CCCACGCCTC 960
TGTCATCCTC ACCCGAGACC ACCTCAATGT CCGGAAGCAG TTTGGAGAGC CTCTGGCCAG 1020 
TAACCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG CCGCGCGGCT 1080 
GATGGTCCGC AATGCAGCAG TGGCTCTGCA GGAGGAGAGG AAGGATGCAG TGGCCTTGTG 1140
CTCCATGGCC AAGCTCTTTG CTACAGATGA ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200 
GATGCACGGG GGCTACGGCT ACCTGAAGGA TTACGCTGTT CAGCAGTACG TGCGGGACTC 1260 
CAGGGTCCAC CAG ATTCTAG AAGGTAGCAA TGAAGTGATG AGGATACTGA TCTCTAG AAG 1 320 
CCTGCTTCAG GAGTAGAACC CACACTTGTT CTGGCCTGCT GTTCAGTGCG ACTGCAGTCA 1 380 
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTCCAAAGGA ATCATGGATT AGACCCAAGG 1440 
GCTGAGCTCC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCACCAGCAT CGGGTCTTGG 1500 
ACTGGGGCAG AATCCCCAGT GG AACCGGAA G AGCTGGACT GATG AGAAAC ATCAG AAGAA 1560 
CACATACTAC CTTG TTTTCC TAATGCCAGA AGGGTG ACCA GTGAAGATTC ACCGTCAAAC 1 620 
CATGAAAGTC CTTTCTTGGA TCCACTTTAT CTTGATTAGT CTGCATTTTA CTAGTTCACT 1 680 
GGATCCCTCC TCTAGGGGCC TGGGG ACTTT CACTGATGCT CTTCCTGATT CTAGAGCAAA 1 740 
GGTGTGGGAA GGGG AAATGG AGG AATGCCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1 800 
TACAGATGCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGTAAATCAT GATAAAATGG 1 860 
ATATTTGGAA ACTTACTCCT AAGCTGTGAT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTGAGACT TTTGAATGTT GAATATTCGT TGGGTTTCAT GTTAAGACGC 1980 
CTGT GGTCC A GGAGTGCTAT TCAGTGTTTC TGTTCCTG AT AAACACTTTG AATATTTTTT 2040 
TGTGTTTTTG TTTCCTTTTC TGAAGCTGTT CCTCCTTTTA AATATTTTTA ATCACATTGA 2100 
TAAAATCTAT CCTTCATCCA CCTCTGGTTC TACTATAGTT GATTTTTATT TTAAATGTTT 2160 
AATTGTATTT GATTAAACAC TTAACTGG AT TTTGG AATAA TAAAACTCTC GTCCAATTTG 2220 
GCTTTTAAAAAAAAAAAA 



SEQ ID NO:130 PFH7 Protein sequence:
Protein Accession #: NP_055199.1



1 11 21 31 41 51
I I I I I I

MLWSGCRRFG ARLGCLPGGL RVLVQTGHRS LTSCIDPSMG LNEEQKEFQK VAFDFAAREM 60 
APNMAEWDQK ELFPVDVMRK AAQLGFGGVY IQTDVGGSGL SRLDTSVIFE ALATGCTSTT 1 20 
AYISIHNMCA WMIDSFGNEE QRHKFCPPLC TMEKFAS YCL TEPGSGSDAA SLLTS AKXQG 180 
DHYILNGSKA FISGAGE5D1 YWMCRTGGP GPKGISCIW EKGTPGLSFG KKEKKVGWNS 240 
QPTRAVIFED CAVPVANRIG SEGQGFUAV RGLNGGRIN1 ASCSLGAAHA SVILTRDHLN 300 
VRKQFGEPLA SNQYLQFTLA DMATRLVAAR LM VRNAAVAL QEERKDAVAL CSMAKLFATD 360 
ECFAICNQAL QMHGGYGYLK DYAVQQYVRD SRVHQILEGS NEVMRHJSR SLLQE 



SEQ ID NO:131 PFH6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013989

Coding sequence: 707-1105 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GCCTGCAGAG AGAGGCACTT TGCACCACAG ACAGATAGCA AGAAGGGAAA GACAG AGAGT 60 
GAGAAAAAAG AGGAGTCAGT CGCTCCTGGG GAAGGGAGAG AGTGAGACTG GGAGAAAGAG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACCCTTAAA 1 80 
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAGAAAGA AACAGGCTAC GTTTAAAGAG 240 
CATAGAGACA ATGAA AGGCT AAAGAAAATT TT AAAATCTC TGCCACAGTC TCATAGGTGC 300 
TTGGAAATGA AAGTAGAACT GCCTGTC l" IT AACGG ACTCT GACAGAGGTA ACTGGATTAG 360 
GG ACGAGTAC GCCAGCTTTT 1111111111 I 111 HIT II TTTAACATCT TAAATCCTGA 420 
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTCCGAAT TG AATGAATT GATGGGCACA 480 
CTCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC ATTTCTGCTT CTTTGAAAG A 540 
GG AGACAACT TCCGCTTCCT TTTAATTTAG TIT I I IT ICC CCTTCTCCCC CAACCCCCAA 600 
CCTTCCCCCT TACCTCCCCC ACCCCCTTTA TCACCACCCC CCTTTTAAAT AAGAGGGTGA 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA GAGAAGATGG GCATCCTCAG 720
CGTAGACTTG CTGATCACAC TGCAAATTCT GCCAGTTTTT TTCTCCAACT GCCTCTTCCT 780 
GGCTCTCTAT GACTCGGTCA TTCTGCTCAA GCACGTGGTG CTGCTGTTG A GCCGCTCCAA 840 
GTCCACTCGC GGAGAGTGGC GGCGCATGCT GACCTCAGAG GG ACTGCGCT GCGTCTGGAA 900 
GAGCTTCCTC CTCGATGCCT ACAAACAGGT GAAATTGGGT GAGGATGCCC CCAATTCCAG 960 
TGTGGTGCAT GTCTCCAGTA CAGAAGG AGG TG ACAACAGT GGCAATGGTA CCCAGG AG AA 1020 
GATAGCTGAG GGAGCCACAT GCCACCTTCT TGACTTTGCC AGCCCTGAGC GCCCACTAGT 1080 
GGTCAACTTT GGCTC AGCCA CTTGACCTCC TTTCACGAGC CAGCTGCCAG CCTTCCGC A A 1 1 40 
ACTGGTGGAA GAGTTCTCCT CAGTGGCTGA CTTCCTGCTG GTCTACATTG ATGAGGCTCA 1200 
TCCATCAGAT GGCTGGGCGA TACCGGGGGA CTCCTCTTTG TCTTTTGAGG TGAAGAAGCA 1260 
CCAGAACCAG GAAGATCGAT GTGCAGCAGC CCAGCAGCTT CTGGAGCGTT TCTCCTTGCC 1320 
GCCCCAGTGC CGAGTTGTGG CTGACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380 
AGCCTTTGAA CGTGTGTGCA TTGTGCAGAG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440 
CCCCTTCTCC TACAACCTTC AAGAAGTCCG GCATTGGCTG GAGAAGAATT TCAGCAAG AG 1500 
ATGAAAGAAA ACTAGATTAG CTGGTTAAAG GTATGATTAT AAGAGAGCTT ATTGTTTTAA 1560 
AAAGTTATAT AAAGGCAAGC AAATTAAG AA CTG AATCCAT ATTTCAAC AG AGCCCTATTG 1 620 
GCTTACTGAA AG ACAGG AGT TTATCTATCG GAAG AACATG AATCTCTAAC AGCTCCATAC 1680 
TTCTTTCACT ACTCAAATGG GATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAAAAGCCC TATGTGAAAA GATCCCAAGA TGGAG AGGAA GAAACGCTAA TTCAGCATGT 1 800 
GTTCATTCTG CATTGAGAAG GAACTGATAC ATCTGATGCA TGCTTTGAGA CCAGAAGAAA 1860 
AGACTTACCT G AATAATTAC TACATTAGGG AAGCTACTGT CTACGTTAAG ATAAAGGGTA 1920 
TTGCCTTGGC TCTATTTGGC ATCGATGGAG CCCAGTTGGA AAATTCCCAA ATATTACAAC 1980
AAGTCCTTG A ACCCAGGCCA TGTGGTTAGA CGTTGGTGTT A AGGTTAGAC CTTATGTTAG 2040 
AGTCATTTCT GATGTTCCAG CTTCTAGCCA TGTAGTGCTC TCAGTCTTCA TACCCCAGAA 2100 
ATTATTGGTA TATTTGTAGA TACCGAGAAT GATCCCTCAG TCTGAGAGGT TAGAATGATC 2160 
ATCTGTAATC TGAGGGTTAA TTTCTAGGCA GGTGGAGAG A GTGGTAAAAA AGAAATGAAA 2220 
TTGACAAGCT AGG AAAGAGG AGGCAGAAAG ATTTGGAAAA TTCACAGAGT TTCACCCTTA 2280 
AGCTGTAG AG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAAAGAAG AAGG AGCTCA ACTAAAAGTG GCATAG AGAA TACACATATA AAAACAATAT 2400 
ATTTGTCATA TGCTCCTAGA GAGGAGAAAG GGGTGATTGA AAG AAAAAAA AATACTTAAA 2460 
TATTTGTAAT TGTGAGGGGT TTCTTTTGGA A ATA ATT ACT TTTG AACCAT GTATGTGGTA 2520 
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGGT 2580
CTGGTGGTGC TGGTCTTTTC CTCCCCATTC CTACAATTTC TATGTGGCCC AAGTCATTCC 2640 
TAATCTTGGT CTCTATAGCA GTGTTCTCTC TGAATGCTGA GCTG AAG AAA TTATACGTAC 2700 
ATACACACAT ACATACATAC ATACA AATAT ATGTATATAT ATTCTCAGCT GCTGCGGGAG 2760 
GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT G AGCTATAGT 2820 
GAAGAATAGG TGCAAACAA A CAAGCTTACT TCCATTGCAA AATAG AAGAA GAGGAAGTTA 2880 
GAGATAATTC TGATCAATCA TTTTGGAGGC TTTGTTATAA GGCAACCCCC GGTATATCAT 2940 
GGAATTTCCA TTG ACATTTG AATTTGGACT TGGATCTTCC CTTGGTCCCA TTAGCTGAGG 3000 
TTTAGTAATC TAAAGTCCCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060 
TAAAATATTT TTTTCTTTTT AAAATAGACA CTATAGTTTT ACCCATAAGT AATATTTAAA 3120 
GATTATAGCT CCCAAAAGAA TGG ACCAACC ACTTTCGTAT CATAATTTCT TTTTGGTAAA 3 1 80 
TATG AGACTA TTATGAAATC ATAGTATATG ATTGTATTTA AAGGTACAAT CAAAGGATCT 3240 
TTTGTCCATT CCATTAATAA CTGAATAAAA AATAAATAAA ATGGATAG AA AAAAACTAAA 3300 
GTTGAAAATA CATTC TTAAA CTAGTTGTCT GAAATGAGAA AAGAGTGAGA ACTAGGTGTG 3360 
CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGG AG CAACATATCA GTCGTGTCAC 3420 
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGG AC TGATTTTTCA TGGCAACATC 3480 
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CCACCCACAA AGAAACAAAG CAAATTTCAT 3540 
TATCTAATGA GTTGCTGGAA AATCATATTG AGAATAATTA TTTCAGATTC CTCAGTTGTT 3600 
AACTTCTACA TTCAAGGGCT TATCTCTGCC CCCATTGATT TTTAACCTCA AAATGGTGTG 3660 
AGATTTACTG TGG AACCCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720 
TGTTGTCCTT AAAATTCCCC TTTTTTCTCT ATGTACGATA AAGTAACAGT ATGTCAGATA 3780 
AGCCGGTGGG GGGATGAGAT TAGGCTGAGG CAGTGCTAGT CAACTGGGGG AAAAGGATGA 3840 
TGGAAAAATC ACCCAGTTGT GCTATATTTT TAAAG AAGGA GGTCGTTTAT GTGTGCAGAC 3900 
AATTCTCCCT GAGGTTAGCC CAATGGAGAA ATG AAGCAGA GGAAGGAAAC ATAG AAAG AC 3960 
ATGGGCTATC AGGGAGGAAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020 
GTGGAAGGGC CAATGGAGAA AATGA ATGG A CAAAGCTCAG GAATCCCTAC GCTATGTAGA 4080 
AT GTTCT TGG TGTTATCAGG GTTAAGCCCT GTAATTATGT AACCTATTTA TCGCAACATG 4140 
AATTTTTATG ATTTCTTGTG ATGTATTCTT TTATG AAATT AACAAGAACT CATTATTTTG 4200 
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAG AA ATTATTAG AT TGCCAATACT 4260 
CATGTGCGTT TCATGTGTTT TATAAGGTTT GTTCCTTTGA AGAATTGTAG TTCTTAGTCC 4320 
CAC AGGG AAA TGTGTATCTA TTTATATATC ATAGTATAAA TCTATG ATAT ATTTATATC A 4380 
TATATAAAAG TCTGAGTTCT CTTTCTTAGT CCCTAATCAT GTTTCTCCCA TAGGCTGTGT 4440
TTACATGGAG CTATC GGTTT AGCCTTTTAA GCTTCATTAG CTTGTCTATT ATTGAAATAG 4500 
TTTCCAAGAA ATTTTAG ATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTGTTTG 4560 
AAAGACTTAT GTCTTGGACC TATCAA AAAC TGACTTTATT TATTGCTTAG TGAAAATACT 4620 
AGTGGGATCA ACAATG ATTT TCTTGAATGG GCATGAATGG AGATGCCCGC ACAGTAATGT 4680 
AGAAATGTTT CATACAGCTA TTAAAATGTA ACTG ACCTCC TTAGAGGCAG ATTAGTAACT 4740 
GTTCCTACTT TGTATAGCTA AGTGACAGTC ACTTAACTTA CATGACTTTC TTTTTTCACA 4800 
TTGGGTCTCT GGTCCTGTGT CTTCACCTCA TTTATAGCAC GTCTCCTTGA TTTTTGGTAG 4860 
TATCAACTTC CCAGTG ATCT GTTCAGTTAA GTTCTTCTCC CGTTAACCAG GAAGTGCTTA 4920 
TTCTCTCATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTGA GTGTATTTTA 4980 
CTATTTGGGC TCTG AAATAA AAATTATG AA ATATGGTG AG GTCACATGTT GGTGCTGCCT 5040 
TGCTGCATAA AATTCTAGGA GGGCAGGTTA GGAGACAGTT ATGTATGGCC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160 
AGCAAGAAGA ATTG ACTGAT TTACAGG ACT TCTCTTTATG TCAATCTTAA GAGGATGGAT 5220 
GAATCTGG AC ATTTGTTCCA CCCGACCTCT GACTG ATGGT TTGGAAAATA ACTTTAATTA 5280 
GGATCATATG ACC ATTG AAA AAGGAAAAAT GTAGACTCTG ACTTCCGTCC CACTG AAGGA 5340 
TTAATGAAAA CCTTTACTAG CATTTAGAGC TTTTCAGAAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA GACTGCAAGT AAGGCTTTTA ATTTTAGGAG Gill 111 IT I III Mill II 5460 
TTCCCCTAAA TGGTATGGCC AAAAGTCAG A GTTAA AATAT ATATAGTTAG ATTCGAACTT 5520 
CCTCCTTCAC TCT A AAA ATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCT GAGGCTCTCT 5640 
AAGAAAAGAG AGGATCTAGG ATGGGAGAGC TAG AAAGTTG CTAACTGGGA AGAACAAGGC 5700 
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTGAAAAC AAACTTCTCG CAACTGAAGG 5760 
AAGGCTG AAG GCTGCTGCA A GTCATTGAGT G ACTTTAGG A TGAGCAAAAC ATTGGGCCAC 5820 
TTCCTAATGC CCTATGTGTA TAGTACCAG A AGCAAGGTCT CAGACTTAAC AGACCCAGCT 5880 
CTGTTCCAAG GTGAGTCTGA ACCAATAGAA AGCAAACATG TGCAGATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AACAGCAGAG CTCCACGGAG 6000 
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060 
TGGCTCAAAC CCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120 
GAAAGAAACC CACAGAAGGG GATGGGAAAT AAAGAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTC AGCGTG GTTTTTATGT GTATTAGG AT TGGGGGATGT GAAGAAATAA 6240 
GTA TCCAG T A CTTT ATAACC AAAGCAATTA AATG ATATTG GGGTAGGGAA TGTTGGCCAG 6300 
TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360 
CGCCCCGAAG AGGGAGACAG AGATGTGCCA GAGTTGACCC AGTGTGCGGA TGATAACTAC 6420 
TGACGAAAG A GTCATCGACC TCAGTTAGTG GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480 
CCCATCTTTG TCTCCCTGGC AAGGAGAATA TGCGGGACAT GATGCTAAGA GCCCTGGGTA 6540 
AATGTGGTG A GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600 
TGAACTGCTT TGGG AGATTA TCAGTAGAAA GAGTGTTATC ATATTGGTGC TGAGTGCTAT 6660 
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT GAATAAAAGA ATAAAAAAAA 6720
A AAAA AA AAA AAAAA 



SEQ ID NO:132 PFH6 Protein sequence:
Protein Accession #: NP_054644.1 

1 11 21 31 41 51 
I I I I I I 

MGILSVDLU TLQILPVFFS NCLFLALYDS VUXKHVVLL LSRSKSTRGE WRRMLTSEGL 60 
RCVWKSFLLD A YKQVKLGED APNSSV VHVS STEGGDNSGN GTQEKJAEGA TCHLLDFASP 1 20 
ERPLVVNFGS ATXPPFTSQL PAFRKLVEEF SS V ADFLLVY IDEAHPSDGW AiPGDSSLSF 1 80 
EVKKHQNQED RCAAAQQLLE RFSLPPQCRV VADRMDNNAN 1AYGVAFERV CIVQRQK1AY 240 
LGGKGPFSYN LQEVRHWLEK NFSKRXKKTR LAG 



SEQ ID NO:133 PFH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001141

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CAGGCGTGTC CCAGGGGGAG CCCCGCTCTG CAGCCCTGTG CGCCGTAG AG AGCTGGACTT 60 
AGGCTGGCAG CATGGCCGAG TTCAGGGTCA CGGTGTCCAC CGGAGAAGCC TTCGGGGCTG 120
GCACATGGG A CAAAGTGTCT GTCAGC ATCG TGGGGACCCG GGGAG AG AGC CCCCCACTGC 1 80 
CCCTGGACAA TCTCGGCAAG GAGTTCACTG CGGGCGCTGA GGAGGACTTC CAGGTGACGC 240 
TCCCGGAGGA CGTAGGCCGA GTGCTGCTGC TGCGCGTGCA CAAGGCG CCC CCAGTGCTGC 300 
CCCTGCTGGG GCCCCTGGCC CCGGATGCCT GGTTCTGCCG CTGGTTCCAG CTGACACCGC 360 
CGCGGGGCGG CCACCTCCTC TTCCCCTGCT ACCAGTGGCT GGAGGGGGCG GGGACCCTGG 420 
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAGACCA CCACCCTGTG CTCCAGCAAC 480 
AGCGCCAGGA GGAGCTTCAG GCCCGGCAGG AGATGTACCA GTGGAAGGCT TACAACCCAG 540 
GTTGGCCTCA CTGCCTGGAT G AAAAGACAG TGGAAGACTT GG AGCTCAAT ATCAAATACT 600 
CCACAGCCA A GAATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAGATGAAAA 660 
TCAAGGGGTT GCTGGACCGC AAGGGGCTCT GGAGGAGTCT GAATGAGATG AAAAGGATCT 720 
TCAACTTCCG GAGGACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780 
TCTTCGCCTC CCAGTTCCTG AATGGTCTCA ACCCTGTCCT GATCCGCCGC TGTCACTACC 840 
TCCCAAAGAA CTTCCCCGTC ACTGATGCCA TGGTGGCCTC ATTGTTGGGT CCTGGGACCA 900 
GCTTGCAGGC TGAGCTAGAG AAGGGCTCCC TGTTCTTGGT GGATCACGGC ATCCTCTCTG 960 
GCATCCAGAC CAATGTCATT AATGGGAAGC CGCAGTTCTC TGCGGCCCCA ATGACCCTGC 1020 
TATACCAGAG CCCAGGCTGC GGGCCGCTGC TGCCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCCCATC TTCCTGCCCA CTGATGACAA GTGGGACTGG TTGCTGGCCA 1 140 
AGACCTGGGT GCGCAATGCC GAGTTCTCCT TCCATGAGGC CCTCACGCAC CTGCTGCACT 1200 
CACATCTGCT GCCTG AGGTC TTCACCCTGG CTACCCTGCG TCAGCTGCCC CACTGCCACC 1260 
CTCTCTTCAA GCTGCTGATC CCGCACACCC GATACACCCT GCACATCAAC ACACTCGCCC 1 320 
GGGAGCTGCT TATCGTGCCA GGGCAGGTGG TGGACAGGTC CACAGGCATC GGCATTGAAG 1380 
GCTTCTCTGA GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTGTGTCTGC 1440 
CTGAGGATAT CCGGACCCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTGATGATG 1500 
GGATGCAG AT TTGGGGTGCA GTGGAACGCT TTGTCTCTG A AATCATCGGT ATCTACTACC 1560 
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCTCCAGGC CTGGGTCAGA GAGATCTTCT 1620 
CCAAGGGCTT CCTAAACCAG GAGAGCTCAG GTATCCCTTC CTCACTGGAG ACCCGGGAAG 1680 
CCCTGGTGCA GTATGTCACC ATGGTGATAT TCACCTGCTC AGCCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTGACTCC TGTGCTTGGA TGCCCAACCT GCCACCCAGC ATGCAGCTGC 1800 
CACCACCCAC CTCCAAAGGC CTGGCAACAT GCGAGGGCTT CATAGCCACC CTCCCACCTG 1 860 
TCAATGCCAC ATGTGATGTC ATCCTTGCTC TCTGGTTGCT GAGCAAGGAG CCTGGAGACC 1920 
AAAGGCCCCT GGGCACCTAT CCGGATG AGC ACTTCACAGA GGAGGCCCCT CGGCGGAGCA 1980 
TCGCCACCTT CCAG AGCCGC CTGGCCCAGA TCTCGAGGGG CATCCAGGAG CGGAACCGGG 2040 
GCCTGGTGCT GCCCTACACC TACCTAGACC CTCCCCTCAT CGAGAACAGC GTCTCCATCT 2100
AAATCCCAGG GGAACACAGG CCCAGATGAC ATCCCTTTGA CCACATCGCT CTAGG ATAAC 2160 
TGGCACCCAG AG AAAAGGAC TCCTCAGAAA AAACAGGCCC CCATGTGCCT CTCCTGGGAC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAG AAAA TCTACCAAG A ACAGAGTCTC AGGACAGAAC CACTGAGTCT 2340 
TTTGGAGGCT CCAAGCCTCA AAGTGCCCGC AGAGCCCACC TTGAGGGTTT TGCTAGTTGG 2400 
TTTTGTTTTG CGTTTACAGC CGTGGGGGG A AGCACATAAT CCCGCCCCAG GGCCCACTAG 2460 
CATCCACTGA TTGGACCTTA TGGTCACCCA ACTCAAGGAC AGCCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATGCCC ATAATCCCAG CACTTTGGG A GATGG AGGCG 2580 
GGAAAATCAT TTG AGGTCAG AAGTTCAAGG CCAGCCTGG A CGACATAGCG AGACTCCACC 2640 
TCTACCAAAA AATAAAA ATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence:
Protein Accession #: NP_001132.1 

1 11 21 31 41 51
I I I I I I

MAEFRVRVST GEAFGAGTWD KVSVSrVGTR GESPPLPLDN LGKEFTAGAE EDFQVTLPED 60 
VGRVLLLRVH KAPPVLPLLG PLAPDAWFCR WFQLTPPRGG HULFPCYQWL EGAGTLVLQE 120 
GTAKVSWADH HPVLQQQRQE ELQARQEMYQ WKAYNPGWPH CLDEKTVEDL ELNIKYSTAK 180 
NAN FY LQ AGS AFAEMKIKGL LDRKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240 
QFLNGLNPVL IRRCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHGILSGIQT 300 
NVINGKPQFS AAPMTLLYQS PGCGPLLPLA IQLSQTPGPN SPIFLPTDDK WDWLLAKTWV 360 
RNAEFSFHEA LTHLLHSHLL PEVFTLATLR QLPHCHPLFK LUPHTRYTL HINTLARELL 420 
IVPGQWDRS TGIG1EGFSE LIQRNMKQLN YSLLCLPEDI RTRGVEDDPG YYYRDDGMQ1 480 
WGAVERFVSE HGIYYPSDE SVQDDRELQA WVREIFSKGF LNQESSG1PS SLETREALVQ 540 
YVTMVIFTCS AKHAA VSAGQ FDSCAWMPNL PPSMQLPPPT SKGLATCEGF IATLPPVNAT 600 
CDVILALWLL SKEPGDQRPL GTYPDEHFTE EAPRRSIATF QSRLAQISRG IQERNRGLVL 660 
PYTYLDPPLI ENSVSI 



SEQ ID NO:135 PFH4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002742

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51
I I I I I I 

GAATTCCTTC TCTCCTCCTC CTCGCCCTTC TCCTCGCCCT CCTCCTCCTC CTCGCCCTCC 60 

CCTCCCGATC CTCATCCCCT TGCCCTCCCC CAGCCCAGGG ACTTTTCCGG AAAGTTTTTA 120
TTTTCCGTCT GGGCTCTCGG AGAAAGAAGC TCCTGGCTCA GCGGCTGCAA AACTTTCCTG 180 
CTGCCGCGCC GCCAGCCCCC GCCCTCCGCT GCCCGGCCCT GCGCCCCGCC GAGCGATGAG 240
CGCCCCTCCG GTCCTGCGGC CGCCCAGTCC GCTGCTGCCC GTGGCGGCGG CAGCTGCCGC 300 
AGCGGCCGCC GCACTGGTCC CAGGGTCCGG GCCCGGGCCC GCGCCGTTCT TGGCTCCTGT 360 

CGCGGCCCCG GTCGGGGGCA TCTCGTTCCA TCTGCAGATC GGCCTGAGCC GTGAGCCGGT 420
GCTGCTGCTG CAGGACTCGT CCGGGGACTA CAGCCTGGCG CACGTCCGCG AGATGGCTTG 480 
CTCCATTGTC GACCAGAAGT TCCCTGAATG TGGTTTCTAC GG AATGTATG ATAAGATCCT 540 
GCTTTTTCGC CATGACCCTA CCTCTGAAAA CATCCTTCAG CTGGTGAAAG CGGCCAGTGA 600 
TATCCAGGAA GGCGATCTTA TTGAAGTGGT CTTGTCACGT TCCGCCACCT TTGAAG ACTT 660 

TCAGATTCGT CCCCACGCTC TCTTTGTTCA TTCATACAGA GCTCCAGCTT TCTGTGATCA 720

CTGTGGAGAA ATGCTGTGGG GGCTGGTACG TCAAGGTCTT AAATGTGAAG GGTGTGGTCT 780 
GAATTACCAT AAGAGATGTG CATTTAAAAT ACCCAACAAT TGCAGCGGTG TGAGGCGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT CAGCACC ATC CGCACATCAT CTGCTGAACT 900 
CTCTACAAGT GCCCCTGATG AGCCCCTTCT GCAAAAATCA CCATCAGAGT CGTTTATTGG 960 

TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGACGA CCAATTCACC TTGACAAGAT 1020
TTTGATGTCT AAAGTTAAAG TGCCGCACAC ATTTGTCATC CACTCCTACA CCCGGCCCAC 1080 
AGTGTGCCAG TACTGCAAGA AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1 140 
AGATTGCAGA TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
CGAAGTGACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 

AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAG ACC ACGAGG ACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440 
CAATATCCCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACGAAGAGGA AAAGCAGCAC 1500 
AGTCATGAAA GAAGGATGGA TGGTCCACTA CACCAGCAAG GACACGCTGC GGAAACGGCA 1560 

CTATTGGAGA TTGGATAGCA AATGTATTAC CCTCTTTCAG AATGAGACAG GAAGCAGGTA 1620
CTACAAGGAA ATTCCTTTAT CTGAAATTTT GTCTCTGGAA CCAGTAAAAA CTTCAGCTTT 1680 
AATTCCTAAT GGGGCCAATC CTCATTGTTT CGAAATCACT ACGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTCCAG CCCATCACCA AATAACAGTG TTCTCACCAG 1800 
TGGCGTTGGT GCAGATGTGG CCAGGATGTG GGAGATAGCC ATCCAGCATG CCCTTATGCC 1860 

CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGGAACCAAC TTGCACAGAG ATATCTCTGT 1920
GAGTATTTCA GTATC AA ATT GCCAG ATTCA AGAAAATGTG GACATCAGCA CAGTATATCA 1980 
GATTTTTCCT GATGAAGTAC TGGGTTCTGG ACAGTTTGG A ATTGTTTATG GAGG AAAACA 2040 
TCGTAAAACA GGAAGAGATG TAGCTATTAA AATCATTGAC AAATTACG AT TTCCAACAAA 2100 
ACAAGAAAGC CAGCTTCGTA ATGAGGTTGC AATTCTACAG AACCTTCATC ACCCTGGTGT 2160

TGTAAATTTG GAGTGTATGT TTGAGACGCC TGAAAGAGTG TTTGTTGTTA TGGAAAAACT 2220
CCATGG AGAC ATGCTGGAAA TGATCTTGTC AAGTGAAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AGATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340 
CGTTCACTGT GACCTCAAAC CAGAAAATGT GTTGCTAGCC TCAGCTG ATC CTTTTCCTCA 2400 
GGTGAAACTT TGTGATTTTG GTTTTGCCCG GATCATTGG A GAGAAGTCTT TCCGGAGGTC 2460 

AGTGGTGGGT ACCCCCGCTT ACCTGGCTCC TGAGGTCCTA AGGAACAAGG GCTACAATCG 2520
CTCTCTAGAC ATGTGGTCTG TTGGGGTCAT CATCTATGTA AGCCTAAGCG GCACATTCCC 2580 
ATTTAATGAA GATG AAGACA TACACG ACCA AATTCAGAAT GCAGCTTTCA TGTATCCACC 2640 
AAATCCCTGG AAGGA AATAT CTCATGAAGC CATTGATCTT ATCAACAATT TGCTGCAAGT 2700 
AAAAATG AG A AAGCGCTACA GTGTGGATAA GACCTTGAGC CACCCTTGGC TACAGGACTA 2760 

TCAGACCTGG TTAGATTTGC GAGAGCTGGA ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820
TGAAAGTG AT GACCTG AGGT GGGAG AAGTA TGCAGGCG AG CAGCGGCTGC AGTACCCCAC 2880 
ACACCTGATC AATCCAAGTG CTAGCCAC AG TGACACTCCT GAGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTGAGCGTG TCAGCATCCT CTGAGTTCCA TCTCCTATAA TCTGTCAAAA 3000 

CACTGTGGAA CTAATAAATA CATACGGTCA GGTTTAACAT TTGCCTTGCA GAACTGCCAT 3060

TATTTTCTGT CAGATGAGAA CAAAGCTGTT AAACTGTTAG CACTGTTGAT GTATCTGAGT 3120
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180 
AAAGTTCCCT GAAACACGAA ACTTGTTATT GTG AATGATT CATGTTATAT TTAATGCATT 3240 
AAACCTGTCT CCACTGTGCC TTTGCAAATC AGTGTTTTTC TTACTGGAGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TGAAGTAGTT CTGTTTGGTG TGTCCCATTG GTGTTGTCAT 3360 

TGTAAACAAA CTCTTGAAGA GTCGATTATT TCCAGTGTTC TATGAACAAC TCCAAAACCC 3420

ATGTGGG AAA AAAATGAATG AGGAGGGTAG GGAATAAAAT CCTAAGACAC AAATGCATGA 3480 
ACAAGTTTTA ATGTATAGTT TTGAATCCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540 
ACTCAAGACA ATGCACCTAG CTGTGCAAGA CCTAGTGCTC TTAAGCCTAA ATGCCTTAGA 3600 
AATGTAAACT GCCATATATA AC AGATAC AT TTCCCTCTTT CTTATAATAC TCTGTTGTAC 3660 

TATCG AAAAT CACCTCCTCA CCAACCTTTC ACCTTTGTGT ATTTTTCAAT AATAAAAAAT 3720 
ATTCTTGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence:
Protein Accession #: NP_002733.1



MSAPPVLRPP SPLLPVAAAA AAAAAALVPG SGPGPAPFLA PVAAPVGGIS FHLQIGLSRE 60 
PVLLLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK ILLFRHDPTS ENHjQLVKAA 120 
SDIQEGDUE WLSRS ATFE DFQIRPIIALF VHSYRAPAFC DHCGEMLWGL VRQGLKCEGC 1 80 
GLNYHKRCAF KiPNNCSGVR RRRLSNVSLT GVSTIRTSSA ELSTSAPDEP LLQKSPSESF 240 

IGREKRSNSQ SYIGRPIHLD KILMSKVKVP HTFVIHSYTR PTVCQYCKKL LKGLFRQGLQ 300

CKDCRFNCHK RCAPKVPNNC LGEVT1NGDL LSPG AESDVV MEEG5DOND5 ERNSGLMDDM 360 
EEAM VQDAEM AMAECQNDSG EMQDPDPDHE DANRT1SPST SNNIPLMRVV QSVKHTKRKS 420 
STVMKEGWMV HYTSKDTLRK RHYWRLDSKC ITLFQNDTGS RYYKEIPLSE ILSLEPVKTS 480 
AUPNGANPH CFETTTANVV YYVGENWNP SSPSPNNSVL TSGVGADVAR MWEIAIQHAL 540 

MPVIPKGSSV GTGTNLHRDI SVSISVSNCQ IQENVDISTV YQDFPDEVLG SGQFGIVYGG 600

KHRKTGRDVA 1KHDKLRFP TKQESQLRNE VAfljQNLHHP GWNLECMFE TPERVFWME 660 
KLHGDMLEMI LSSEKGRLPE HITKFLTTQI LVALRHLHFK NIVHCDLKPE NVLLASADPF 720 
PQVKLCDFGF ARUGEKSFR RSWGTPAYL APEVLRNKGY NRSLDMWSVG VHYVSLSGT 780 
FPFNEDEDIH DQIQNAAFMY PPNPWKEISH EAIDLINNLL QVKMRKRYSV DKTLSHPWLQ 840 

DYQTWLDLRE LECKIGERYI THESDDLRWE KYAGEQRLQY PTHLINPSAS HSDTPETEET 900
EMKALGERVS IL 



SEQ ID NO:137 PFH3 DNA SEQUENCE

Nucleic Acid Accession #: X95425

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

AATGGTCAGT CAATACATTA TAACATAATA CACC AAATGC TAGAATAGAA GGGG AGGGGG 60 
GCACACATAA TGACTCACTG CTGGAAGAAG GGTGCATCAG TGAATTAAAA AATGTCCCTC 120 
CCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GTCTCTTTGA ACTCTGGATC 180 
TTTGCTTTTG CTCGCTGCTC TCCTGTTTTT C ATTCTCCAC ATTTTCTCAA TCCTCTTTCT 240 

TTATCCTTAG CCACCCTGCT til 1 1 CCTCC TTTTTTAAAA AATCGGAGAT TTCGTCTTAA 300
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGG CTGCA AA GAACTTCACC 360 
TTTCCCCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTGA AAGGCAAAGG 420 
AACAGGACCC AGACCCTCTC GACACCCTTG ATCCGAGTCA GATCTGCACT AGCAACCAGA 480 
ACTAATATTT CATTTAACCC ACCAAAAGGG GGAGGCGAGA GGAGCCAG AA GCAAACTTC A 540 

TCTGTCTCAG ACGGATCCGT GGTTCCTACA TTTGGAGGAG (XGCGTGTCA GAAGGCGTAG 600
GACCCCAAGG GGGGACAAGG AGGACTCCCG AGTCTCCCTT CTCCGCTCTC CGAGACCGAA 660 
GAGGTGG ACT GAGCCGCTCG GGACAGCGGC ACCGGAGGAG GCTCGGAGAA GATGCGGGGC 720 
TCGGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCGA CACCCCCATC 780 
ACCCCAGCGT CCCTGGCCGG CTGCTACTCT GCACCTCGAC GGGCTCCCCT CTGGACGTGC 840 

CTTCTCCTGT GCGCCGCACT CCGGACCCTC CTGGCCAGCC CCAGCAACGA AGTGAATTTA 900
TTGGATTCAC GCACTGTCAT GGGGGACCTG GGATGGATTG CTTTTCCAAA AAATGGGTGG 960 
GAAGAG ATTG GTGAAGTGGA TG AAA ATTAT GCCCCTATCC ACACATACCA AGTATGCAAA 1020 
GTGATGGAAC AGAATCAG AA TAACTGGCTT TTGACCAGTT GGATCTCCAA TG AAGGTGCT 1080 
TCCAGAATCT TCATAG AACT CAAATTTACC CTGCGGGACT GCAACAGCCT TCCTGGAGGA 1140 

CTGGGGACCT GTAAGGAAAC CTTTAATATG TATTACTTTG AGTCAGATGA TCAGAATGGG 1200
AGAAACATCA AGGAAAACCA ATACATCAAA ATTGATACCA TTGCTGCCGA TGAAAGCTTT 1260 
ACAGAACTTG ATCTTGGTGA CCGTGTTATG AAACTGAATA CAGAGGTCAG AGATGTAGGA 1320 
CCTCTAAGCA AAAAGGGATT TTATCTTGCT TTTCAAGATG TTGGTGCTTG CATTGCTCTG 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC CCTTCTGTGG TACGACACTT GGCTGTCTTC 1440 

CCTGACACCA TCACTGGAGC TGATTCTTCC CAATTGCTCG AAGTGTCAGG CTCCTGTGTC 1500
AACCATTCTG TGACCGATGA ACCTCCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GG AAATGC AT GTGCAAGGCA GGATATGAAG AGAAAAATGG CACCTGTCAA 1620 
GTGTGCAGAC CTGGGTTCTT CAAAGCCTCA CCTCACATCC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATGA GGAAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740 

AGGAGAGAGT CTGATCCACC CACAATGGCA TGCACAAGAC CCCCCTCTGC TCCTCGGAAT 1800
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTCCGCC TGCTGACACT 1860 
GGTGGAAGGA AAGACGTGTC ATATTATATT GCATGCAAGA AGTGCAACTC CCATGCAGGT 1920 
GTGTGTGAGG AGTGTGGCGG TCATGTCAGG TACCTTCCCC GGCAAAGCGC CCTGAAAAAC 1980 
ACCTCTGTCA TGATGGTGGA TCTACTCGCT CACACAAACT ATACCTTTGA GATTGAGGCA 2040 

GTGAATGGAG TGTCCGACTT GAGCCCAGGA GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAAAT TGCAAA AAAC 2160 
AGCATCTCTT TGTCTTGGCA AGAACCAGAT CGTCCCAATG GAATCATCCT AG AGTATGAA 2220 
ATCAAGCATT TTGAAAAGG A CCAAG AGACC AGCTACACGA TTATCAAATC TAAAGAG ACA 2280 
ACTATTACTG CAGAGGGCTT GAA ACCAGCT TCAGTTTATG TCTTCCAAAT TCGAGCACGT 2340 

ACAGCAGCAG CCTATGGTGT CTTCAGTCGA AGATTTGAGT TTGAAACCAC CCCAGTGTTT 2400
GCAGCATCCA GCGATCAAAG CCAGATTCCT GTAATTGCTG TGTCTGTGAC AGTAGGAGTC 2460 
ATTTTGTTGG CAGTGGTTAT CGGCGTCCTC CTCAGTGGAA GTTGCTGCGA ATGTGGCTGT 2520 
GGG AGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATCC TAATATGGCG GTGTGGCTAC 2580 
AGCAAAGCAA AACAAGATCC AGAAGAGGAA AAGATGCATT TTCATAATGG GCACATTAAA 2640 
CTGCCAGGAG TAAGAACTTA CATTGATCCA CATACCTATG AGGATCCCAA TCAAGCTGTC 2700
CACGAATTTG CCAAGGAGAT AGAAGCATCA TGTATCACCA TTGAGAGAGT TATTGGAGCA 2760 
GGTGAATTTG GTGAAGTTTG TAGTGGACGT TTGAAACTAC CAGGAAAAAG AGAATTACCT 2820 
GTGGCTATCA AAACCCTTAA AGTAGGCTAT ACTGAAAAGC AACGCAGAGA TTTCCTAGGT 2880 
GAAGCAAGTA TCATGGGACA GTTTG ATCAT CCTA ACATCA TCCATTTAGA AGGTGTGGTG 2940 
ACCAAAAGTA AACCAGTGAT G ATCGTGACA GAGTATATGG AGAATGGCTC TTTAGATACA 3000 
TTTTTGAAGA AAAACGATGG GCAGTTCACT GTGATTCAGC TTGTTGGCAT GCTGAGAGGT 3060 
ATCTCTGCAG GAATGAAGTA CCTTTCTGAC ATGGGCTATG TGCATAGAGA TCTTGCTGCC 3120 
AGAAACATCT TAATCAACAG TAACCTTGTG TGCAAAGTGT CTGACTTTGG ACTTTCCCGG 3180 
GTACTGGAAG ATGATCCCGA GGCAGCCTAC ACCACAAGGG GAGGAAAAAT TCCAATCAGA 3240 
TGGACTGCCC CAGAAGCAAT AGCTTTCCGA AAGTTTACTT CTGCCAGTG A TGTCTGGAGT 3300 
TATGGAATAG TAATGTGGGA AGTTGTGTCT TATGGAGAGA GACCCTACTG GGAGATGACC 3360 
AATCAAGATG TGATTAAAGC GGTAGAGGAA GGCTATCGTC TGCCAAGCCC CATGG ATTGT 3420 
CCTGCTGCTC TCTATCAGTT AATGCTGGAT TGCTGGCAGA AAGAGCGAA A TAGCAGGCCC 3480 
AAGTTTGATG AAATAGTCAA CATGTTGGAC AAGCTGATAC GTAACCCAAG TAGTCTG AAG 3540 
ACGCTGGTTA ATGCATCCTG CAG AGTATCT AATTTATTGG CAGAACATAG CCCACTAGGA 3600 
TCTGGGGCCT ACAGATCAGT AGGTGAATGG CTAGAGGCAA TCAAGATGGG CCGGTATACA 3660 
GAGATTTTCA TGG AAAATGG ATACAGTTCA ATGGACGCTG TGGCTCAGGT GACCTTGGAG 3720 
GATTTGAGAC GGCTTGG AGT GACTCTTGTC GGTCACCAGA AGAAGATC AT GAACAGCCTT 3780 
CAAGAAATGA AGGTGCAGCT GGTAAACGG A ATGGTGCCAT TGTAACTTCA TGTAAATGTC 3840 
GCTTCTTCAA GTGAATGATT CTGCACTTTG TAAACAGCAC TGAGATTTAT TTTAACAAAA 3900 
AAA 



SEQ ID NO:138 PFH3 Protein sequence: 
Protein Accession #: CAA64700.1 



1 11 21 31 41 51
I I I I I I 

MRGSGPRGAG HRRPPSGGGD TPITPASLAG CYSAPRRAPL WTCLLLCAAL RTLLASPSNE 60 
VNLLDSRTVM GDLGW1AFPK NGWEEIGEVD ENYAPIHTYQ VCKVMEQNQN NWLLTSW1SN 120 
BGASR1F1EL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNIKENQ YDCIDTIAAD 180 
ESFTELDLGD RVMKLNTEVR DVGPLSKKGF YLAFQDVGAC IALVSVRVYY KKCPSWRHL 240 
AVFPDTITGA DSSQIXEVSG SCVNHSVTDE PPKMHCSAEG EWLVPIGKCM CKAGYEEKNG 300 
TCQVCRPGFF KASPHIQSCG KCPPHSYTHE EASTSCVCEK DVFRRESDPP TMACTRPPSA 360 
PRNAJSNVNE TSVFLEWIPP ADTGGRKDVS YYIACKKCNS HAGVCEECGG HVRYLPRQSG 420 
LKNTSVMMVD LLAHTNYTFE 1EAVNGVSDL SPGARQYVSV NVTTNQAAPS PVTNVKKGKI 480 
AKNS1SLSWQ EPDRPNGUL EYEIKHFEKD QETSYTOKS KETTfTAEGL KPAS VYVFQl 540 
RARTAAGYGV FSRRfEFETT PVFAASSDQS QIPVIA VSVT VGVIIXA Wl GVIXSGSCCE 600 
CGCGRASSLC AVAHPILIWR CGYSKAKQDP EEEKMHFHNG HIKLPGVRTY 1DPHTYEDPN 660 
QAVHEFAKEI EASCITIERV IGAGEPGEVC SGRLKJLPGKR ELPVAIKTLK VGYTEKQRRD 720 
FLGEASIMGQ FDHPNIIHLE G WTKSKPVM IVTEYMENGS LDTPLKKNDG QFTVIQLVGM 780 
LRG1SAGMKY LSDMGYVHRD LAARNUJNS NLVCKVSDFG LSRVLEDDPE AAYTTRGGKI 840 
PIRWTAPEA1 AFRKFTSASD VWSYGIVMWE VVSYGERPYW EMTNQDVIKA VEEGYRLPSP 900 
MDCPAALYQL MLDCWQKERN SRPKFDE1VN MLDKURNPS SLKTLVN ASC RVSNLLAEHS 960 
PLGSGAYRS V GEWLEAIKMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKK1M 1020 
NSLQEMKVQL VNGMVPL 



SEQ ID NO:139 PFH2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_015029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 
GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120
TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTG ACGCTAC 1 80 
TATGGGCCGA GTGGCAGGG A CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 
TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTG AAA AG AAGATGCC 360 
TAGAG AATGG CAATTTAA AA G AAAAAG ATA TACTTGTTTT GCCCCTTG AC CTGACCG ACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGG AATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGG ATGTCT 540 
ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600 
CTCACATG AT CGAGAGGAAG CAAGGAAAG A TTGTTACTGT GAATAGCATC CTGGGTATCA 660 
TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GO 1 1 1 II II A 720 
ATGGCCTTCG AAC AGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 
GACCTGTGCA ATCAAATATT GTG GAGA ATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 
GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GCATCTCAGA ACAACCTTTC TTGTTAGTAA 960 
CATATTTGTG GCA ATACA TG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 
AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 
AGACAAAACA TGACTGAAAA GAGCACCTGT ACM II C AAG CCACTGGAGG GAGAAATGGA 1140
AAACATG AAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAG ACTAAT TTGTGATTTT 1200 
ACrrriTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260
AG ATTGCCAT GAATCTTGCA AA 



SEQ ID NO:140 PFH2 Protein sequence:
Protein Accession #: NP.0571U1 

1 11 21 31 41 51 
I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLW AEWQ GRRPEWELTD MVVW VTG ASS 60 
G1GEELA YQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKUE LNYLGTVSLT KCVLPHMER 180 
KQGKIVTVNS ILG1ISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240 
IVENSLAGEV TKT1GNNGDQ SHKMTTSRCV RLN1L1SMAND LKEVWISEQP FLLVTYLWQY 300 
MPTWAWW1TN KMGKKR1ENF KSGVDADSSY FKIFKTKHD 



SEQ ID NO:141 PFH1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_021814

Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

ATGAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACTT GAGCGCGTCC 60 
CGCCGGAACC TGCACGAGAT GGACTCAGAG GCGCAGCCCC TGCAGCCCCC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCCGTCT GCAGCCGCTG CCGCCGCCGC CGCTGTTTCG 180 
TCCTCAGCCC CCGAGATCGT GGTGTCTAAG CCCGAGCACA ACAACTCCAA CAACCTGGCG 240 
CTCTATGGAA CCGGCGGCGG AGGCAGCACT GGAGGAGGCG GCGGCGGTGG CGGGAGCGGG 300 
CACGGCAGCA GCAGTGGCAC C AAGTCCAGC AAAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGCCACC GGCGCGCCCT GTTCGAAAAG CGCAAGCGGC TCAGCGACTA CGCGCTCATC 420 
TTCGGCATGT TCGGCATCGT GGTCATGGTC ATCG AGACCG AGCTGTCGTG GGGCGCCTAC 480 
G ACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGCCTTA TC AGTCTCTC CACG ATCATC 540 
CTGCTCGGTC TGATCATCGT GTACCACGCC AGGGAAATAC AGTTGTTCAT GGTGG ACAAT 600 
GGAGCAGATG ACTGGAGAAT AGCCATGACT TATG AGCGTA TTTTCTTCAT CTGCTTGGAA 660 
ATACTGGTGT GTGCTATTCA TCCCATACCT GGGAATTATA CATTCACATG GACGGCCCGG 720 
CTTGCCTTCT CCTATGCCCC ATCCACAACC ACCGCTGATG TGGATATTAT TTTATCTATA 780 
CCAATGTTCT TAAGACTCTA TCTGATTGCC AG AGTCATGC TTTTACATAG CAAACTTTTC 840 
ACTGATGCCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACGTTTT 900 
GTTATGAAGA CTTTAATGAC TATATGCCCA GG AACTGTAC TCTTGGTTTT TAGTATCTCA 960 
TTATGGATAA TTGCCGCATG GACTGTCCGA GCTTGTGAAA GGTACCATGA TCAACAGGAT 1020 
GTTACTAGCA ACTTCCTTGG AGCGATGTGG TTGATATCAA TAACTTTTCT CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GGAAAAGGAG TCTGCTTACT TACTGG AATT 1140 
ATGGGTGCTG GTTGCACAGC CCTGGTGGTA GCTGTAGTGG CAAGGAAGCT AGAACTTACC 1200 
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGGATACTC AGCTGACTAA AAGAGTAAAA 1260 
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAATTT ACAAAAATAC AAAGCTAGTG 1320 
AAAAAGATAG ATCATGCAAA AGTAAGAAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380 
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTGAATG ACCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AGACCCAGAA CATCATGTAT GATATGATTT CTGACTTAAA CGAAAGGAGT 1500 
GAAGACTTCG AGAAGAGGAT TGTTACCCTG GAAACAAAAC TAGAGACTTT GATTGGTAGC 1560 
ATCCACGCCC TCCCTGGGCT CATAAGCCAG ACCATCAGGC AGCAGCAGAG AGATTTCATT 1620 
GAGGCTCAGA TGGAGAGCTA CGACAAGCAC GTCACTTACA ATGCTGAGCG GTCCCGGTCC 1680 
TCGTCCAGGA GGCGGCGGTC CTCTTCCACA GCACCACCAA CTTCATCAGA GAGTAGCTA£ 



SEQ ID NO:142 PFH1 Protein sequence: 
Protein Accession*: NP_OS7627 

1 11 21 31 41 51 
I I I I I 1 

MSSCRYNGGV MRPLSNLSAS RRNLHEMDSE AQPLQPPASV GGGGGASSPS AAAAAAAAVS 60 
SSAPEIWSK PEHNNSNNLA LYGTGGGGST GGGGGGGGSG HGSSSGTKSS KKKNQNIGYK 1 20 
LGHRRALFEK RKRLSDYAU FGMFG1WMV IETBLSWGAY DKASLYSLAL KCL1SLSTU 180 
LLGUIV YHA REIQLFM VDN GADDWRJAMT YERIFFICLE ILVCAIHPIP GNYTFTWTAR 240 
LAFSYAPSTT TADVDI1LSI PMFLRLYLIA RVMLLHSKLF TDASSRSIGA LNKJNFNTRF 300 
VMKTLMT1CP GTVLLVFSIS LWflAAWTVR ACERYHDQQD VTSNFLG AMW US1TFLSIG 360 
YGDMVPNTYC GKGVCLLTGI MGAGCTALVV AVVARKLELT KAEKHVHNFM MDTQLTKRVK 420 
NAAANVLRET WLIYKNTKLV KK1DHAKVRK HQRKFLQAIH QLRSVKMEQR KLNDQANTLV 480 
DLAKTQNIMY DMISDLNERS EDFEKRIVTL ETKLETLIGS IHALPGLISQ TIRQQQRDFI 540 
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE 
Nucleic Acid Accession #: AL110139, coding region is FGENESH predicted 
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
ATGCGCGCCG TGCCGCTGCC CGCCCCGCTC CTGCCGCTGC TGCTGCTCGC GCTCCTGGCC 60 
GCTCCCGCCG CCCGCGCCAG CAGAGCCGAG TCCGTCTCCG CGCCGTGGCC CGAACCCGAG 120 
CGCGAGTCGC GGCCACCGCC CGGCCCGGGG CCCGGGAACA CCACCCGGTT TGGGTCTGGG 180 
GCGGCGGGCG GCAGCGGCAG CTCCAGCTCC AACAGCAGTG GCGACGCCTT GGTGACCCGC 240 
ATTTCCATCC TCCTCCGCGA CCTACCCACC CTCAAGGCAG CCGTGATCGT GGCGTTCGCC 300 
TTTACCACCC TCCTCATCGC CTGCCTGCTG CTGCGCGTCT TCAGGTCGGG AAAGAGGTTA 360 
AAGAAGACAC GCAAGTATGA TATCATCACC ACTCCAGCAG AGCGAGTGGA AATGGCGCCA 420 
CTAAATGAAG AGGATGATGA AGATGAGGAC TCCACAGTAT TCGACATCAA ATACAGAGTG 480 
TCCTTGCCGG CTGCACTGAG ACGTCAGCTG CCAGGGTGCC AGACGCTACT GACAGTTCCT 540 
GTGCCCCCAC CCTTCATCCT CGACATTGAC CTTCCAGCAA GATGCAGTGG AAGGCCTG AT 600 
GGTGGAATCA GACCTGGTA A AACCTGTTTC CCAGCCTGGT GGCATCCTGT GGAAAGTTGG 660 
TCAGCTGCAA CCTGGGGTGT GAAGGACTGG ACCTGG AAGC CCTCTTGCGT CGGAGGTGTT 720 
GAAACCAAAA CGAACGTTAT GTATAAAACC CCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780 
TCAGACTGTC ACTGGCAAGC TCGTTTCCAC GTCACCACAA TGGAGTTGCT TCTGCCACCC 840 
TTTGGGCATC CCTTTAAAGT GCCCCCTACT TCTACTCCCC ATGGTTTTCG ACAACTGCAG 900 
CTGAATCTCA TGGAA AAGCT GGATTCCTCT GCCTTACGCA GAAACACCCG GGCTCCATCT 960 
GCCAGGTGCT TGCCACTGGT CCTGGCAGAA ATGGCGGCTG CTGAAAGTGA CCTTCCAAAT 1020 
CCTTGGTGGC ACTTCAGCGC CACAGGCTCT CCAATAAAAA CCCTTTACAC ACA AACCATG 1080 
AGTACCTTGG GCTTGGATGT TTTCTGTGGT GCCGGCCAGC GGGGC ACCTT TTGTG AAG AC 1 140 
AGAGCAGTGA CTAAGGTTCT CCAGGGTAGC TCTTTCTCCA AACAGCTGCG CTGGAAGCCA 1200 
GCCCTAGAGA GTGGGTTTCC CCATCATCTC AGGCTTCTCA GAGAGTGTCC TCCGCTGAGC 1260 
ACCCATCCTG TCAGGTTGGC TCGTTCAG AT GCCCGGGG AC AAGCCAGCCT GACGGGGAGG 1 320 
AGGGTGTTTC GGCGTCCGCG GCAGTCTCTG CATGGCGGAG GGTCAGCGGG TACCGCAACT 1380 
TGCCTTTTGG TTTTGAAG AT TCTGTTGAGG CGCCATCCTC ACCTTGACCT CTTCTACAAA 1440 
ATCTGTCTCC CCTGCTGTGC CGTGGAACAC CTACGGGAAG CCAAGAGAAG CTCAGTGACT 1500 
GTCCTTGCGT CATTTG AGCA GAGCCCACAA AAGGCAGCTG CTGCCCACGG GGAGCCTGTC 1560 
AAACG AGGGC CCAGTGGGCA ATTGACCAG A CACACATGCC CTGGCTGGGG GATCACAC AT 1 620 
GCGAACCTGC AGACAATTCC AGATACCCAA GGCCAGGAAG GCCCACGTGA GGATGTCACT 1680 
CACCCTGGAG GAGACTTGGA TGGGGTGGCA AATTTCTATT TGGAGGAAGA GGGTTTCCAG 1740 
GATGGCAGAT GCCAG AAGAT GGTCCTGATG TCTG AGG AAG GGCCACCTAG TTTG ACAGG A 1 800 
TGTGAGAGGC TCACAGGTTC CCATCACTTC TCCAGCCATT CCAAGTCTTG GTCCTTCCTT I860 
TCCCCCCG AC AGCCCCTGTT TCTGTCC AGG CCCTGA 



SEQ ID NO:144 PFG9 Protein sequence: 

Protein Accession #: none available, FGENESH predicted 

1 11 21 31 41 51 

MRAVPLPAPL LPLLLLALLA APAARASRAE SVS APWPEPE RESRPPPGPG PGNTTRFGSG 60 
AAGGSGSSSS NSSGDALVTR ISILLRDUPT LKAAVTVAFA FTTLUACLL LRVFRSGKRL 1 20 
KKTRKYDITT TPAERVEMAP LNEEDDEDED STVFDIKYRV SLPAALRRQL PGCQTLLTVP 1 80 
VPPPFILDID LPARCSGRPD GGIRPGKTCFPAWWHPVESW SAATWGVKDW TWKPSCVGGV 240 
ETKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQLQ 300 
LNLMEFCLDSS ALRRNTRAPS ARCLPLVLAE MAAAESDLPN PWWHFS ATGS PIKTLYTQTM 360 
STLGLDVFCG AGQRGTFCED RAVTKVLQGS SFSKQLRWKP ALESGFPHHL RLLRECPPLS 420 
THPVRLARSD ARGQASLTGR RVFRRPRQSL HGGGS AGTAT CLLVUULLR RHPHLDLFYK 480 
ICLPCCAVEH LREAKRSSVT ViASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWG1TH 540 
ANLQTIPDTQ GQEGPREDVT HPGGDLDGVA NFYLEEEGFQ DGRCQKMVLM SEEGPPSLTG 600 
CERLTGSHHF SSHSKSWSFL SPRQPLFLSR P 



SEQ ID NO:145 PFG6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_013427 

Coding sequence: 875-3799 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

GGCTGGGCTG CGAATAGCGT GTTCCTCTCC GGCGGAACAC ACACACCCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCCCTCCT CCACGG AGAG CGCTGAGCGC CGCCGGGAAT TCCATCCCAC 1 20 
CGTGGGCACG CAGTCTTTGG AGGTCCCGGG CGCAGCACGC TCGGTGTCCC CACACTGCAG 180 
CAAG ACAGAG ACCCCGCGGG AACCTTGAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAGAGTGGGC GCAGCAGCCC AGCGGAGGCC AGGCGCGCAA CCTCGGGCGC CGGGGCAAGG 300 
AG AG AGTGCA GGGAGGCGCA GCTCAGGCGC CCGGCTCAGG AGCGGGAGGA AGTTCTCGCG 360 
GCGCCGGG AG CGCGGTGGAC GCGCCCTGGG CGCACGCCCA GGCAGCCTTC TCCCTGGCCC 420 
TCGGG ACTGT CCTCGGGCGG CAAGGAGGAG CTTGCTGGAG TCTTAGAGGC CATCCAGAGC 480 
CAGCG AGCAG GAGCGCTGCG TCTCCCGCCT CAGCTAGG AA GGGGGAGTGG CGCTGGCAGG 540 
CTGGAGCTGG GAACCCAGCG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCC TCTT CGC 600 
GTCTTGGGCT CCGGAGGAAG GTTCTAGCGG CTGCAGGAGG TCCCCAG ACC CATTTTCCTA 660 
GAAGGCTGGT GATGGATCTG CTGCTCCTGC CGCCGCCGGG GCACTTGG AG CGCACCGGCG 720 
GCGCGTG AGC TGGGCTTTGC TCTCCACCGC CCTGGGCAAA CCCCGGGCCA GCCCCGCCTG 780 
GCACCTTTGC CTGAGTCCCT TTCGGTTCCC GACCCAAAGC CACCAGCGTC CAGGG AGGGA 840 
GGAGGAGGTG GTCCTCAGGT GCAGCCCCGC CGAGATGTCC GCGCAGAGCC TGCTCCACAG 900 
CGTCTTCTCC TGTTCCTCGC CCGCTTCAAG TAGCGCGGCC TCGGCCAAGG GCTTCTCCAA 960 
GAGG AAGCTG CGCCAG ACCC GCAGCCTGG A CCOGGCCCTG ATCGGCGGCT GCGGG AGCGA 1020 
CGAGGCGGGC GCGGAGGGCA GTGCGCGGGG AGCCACGGCG GCCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC GAGAGTCTCG GCCCTCGCTT GGCGTCCTCT TCCCGGGGTC CGCCCCCCAG 1140 
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200 
GCAGGAGAAG TCACCATCCG GCAGCTTTCA CTTTGACTAT GAGGTTCCCC TGGGTCGCGG 1260 
CGGCCTCAAG AAG AGCATGG CCTGGGACCT GCCTTCTGTC CTGGCCGGGC CAGCCAGTAG 1 320 
CCGAAGCGCT TCCAGCATCC TCTGTTCATC CGGGGGAGGC CCCAATGGCA TCTTCGCTTC 1380 
TCCTAGGAGG TGGCTCCAGC AGAGG AAGTT CCAGTCCCCA CCCGACAGTC GCGGGCACCC 1440 
CTACGTCGTG TGGAAATCCG AGGGTGATTT CACCTGGAAC AGCATGTCAG GCCGCAGTGT 1500 
GCGGCTGAGG TCAGTCCCCA TCCAGAGTCT CTCAGAGCTG GAGAGGGCCC GGCTGCAGGA 1560 
AGTGCCTTTT TATCAGTTGC AACAGGACTG TGACCTGAGC TGTCAGATCA CCATTCCCA A 1620 
AGATGGACAA AAG AG A AAG A AATCTTTAAG AAAGAAACTG GATTCACTAG GAAAGGAGAA 1680 
AAACAAAGAC AAAGAATTCA TCCCACAGGC ATTTGGAATG CCCTTATCCC AAGTCATTGC 1740 
GAATGACAGG GCCTATAAAC TCAAGCAGGA CTTGC AGAGG GACGAGCAGA AAGATGCATC 1800 
TGACTTTGTG GCTTCCCTCC TCCCATTTGG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860 
CAGTAACTCA TCTCTCAGCT CAACCTCAG A A ACACCG AAT GAGTCAACGT CCCCAA ACAC 1 920 
CCCGGAACCG GCTCCTCGGG CTAGGAGGAG GGGTGCCAT G TCA GTGGATT CTATCACCGA 1980 
TCTTGATGAC AATCAGTCTC GACTACTAGA AGCTTTACAA CTTTCCTTGC CTGCTGAGGC 2040 
TCAAAGTAAA AAGGAAAAAG CCAGAGATAA GAAACTCAGT CTGAATCCTA TTTACAGACA 2100 
GGTCCCTAGG CTGGTGG ACA GCTGCTGTCA GCACCTAGAA AAACATGGCC TCCAGACAGT 2160 
GGGGATATTC CGAGTTGGAA GCTCAAAAAA GAGAGTGAGA CAATTACGTG AGGAATTTGA 2220 
CCGTGGGATT GATGTCTCTC TGGAGGAGGA GCACAGTGTT CATGATGTGG CAGCCTTCCT 2280 
GAAAGAGTTC CTGAGGG ACA TGCCAGACCC CCTTCTCACC AGGGAGCTGT ACACAGCTTT 2340 
CATC AACACT CTCTTGTTGG AGCCGGAGGA ACAGCTGGGC ACCTTGCAGC TCCTCATATA 2400 
CCTTCTACCT CCCTGCAACT GCGACACCCT CCACCGCCTG CTACAGTTCC TCTCCATCGT 2460 
GGCCAGGCAT GCCGATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTG GGAATAAAAT 2520 
GACATCTCTA AACTTAGCCA CCATATTTGG ACCCAACCTG CTGCACAAGC AGAAGTCATC 2580 
AGACAAAGAA TTCTCAGTTC AGAGTTCAGC CCGGGCTGAG GAGAGCACGG CCATCATCGC 2640 
TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTCCCC CAGATCTCCA 2700 
GAACG AAGTG CTGATCAGCC TGTTAGAGAC CGATCCTGAT GTCGTGGACT A TTTAC TCAG 2760 
AAGAAAGGCT TCCCAATC AT CAAGCCCTGA CATGCTGCAG TCGGAAGTTT CCTTTTCCGT 2820 
GGGAGGGAGG CATTCATCTA CAG ACTCCAA CAAGGCCTCC AGCGGAGACA TCTCCCCTTA 2880 
TGACAACAAC TCCCCAGTGC TGTCTGAGCG CTCCCTGCTG GCTATGCAAG AGGACGCGGC 2940 
CCCGGGGGGC TCGG AGAAGC TTTACAGAGT GCCAGGGCAG TTTATGCTGG TGGGCCACTT 3000 
GTCGTCGTCA AAGTCAAGGG AAAGTTCTCC TGG ACCAAGG CTTGGGAAAG ATCTGTCAGA 3060 
GGAGCCTTTC GATATCTGGG GAACTTGGCA TTCAACATTA AAAAGCGGAT CCAAAGACCC 31 20 
AGGAATGACA GGTTCCTCTG GAG ACATTTT TGAAAGCAGC TCCCTAAGAG CGGGGCCCTG 31 80 
CTCCCTTTCT CAAGGGAACC TGTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAGA 3240 
GCTGGACAGC G ACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCCCCG CGACGG AGGG 3300 
CAGGGCCCAC CCTGCGGTGT CGCGCGCCTG CAGCACGCCC CACGTCCAGG TGGCAGGG AA 3360 
AGCCGAGCGG CCCACGGCCA GGTCGGAGCA GTACTTGACC CTGAGCGGCG CCCACG ACCT 3420 
CAGCGAGAGT GAGCTGG ATG TGGCCGGGCT GCAGAGCCGG GCCACACCTC AGTGCCAAAG 3480 
ACCCCATGGG AGTGGGAGGG ATGACAAGCG GCCCCCGCCT CCATACCCGG GCCCAGGGAA 3540 
GCCCGCGGCA GCGGCAGCCT GGATCCAGGG GCCCCCGGAA GGCGTGG AGA CACCCACGGA 3600 
CCAGGGAGGC CAAGCAGCCG AGCGAGAGCA GCAGGTCACG CAG AAAAAAC TGAGCAGCGC 3660 
CAACTCCCTG CCAGCGGGCG AGCAGGACAG TCCGCGCCTG GGGGACGCTG GCTGGCTCGA 3720 
CTGGCAGAG A GAGCGCTGGC AGATCTGGGA GCTCCTGTCG ACCGACAACC CCGATGCCCT 3780 
GCCCGAGACG CTGGT CTGA G CCCGCACCCA GCCGAGCCCC CCCTGCCCCG AGCCCCCCGC 3840 
CCTCCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAQTGT TCTTCTTTCA 3900 
CACTTCTCAA AAGTG ACACA AGAGAAATCC AGTTCACCTA CAG AGGTAG A GCACTCACGC 3960 
CCCCGCCATT G AG AATAAGG TTCCATTGCG TAGCCAGCCT TAGG AAAAAC AAACAGAACC 4020 
CAAACCAGAT GGCAATGTCC AATCTAAAAA CGTCCCTCTT GGCTCTATAA TATAAGATAC 4080 
AACTCTTGCT TGGTATAGCC TAACCGTATT TATGTGTCTT CGGTTTTGAC TATTGTGTAT 4140 
TCTGTAACAG ATTATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200 
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAATGAG ATATA CTAAA C AATGA GATT 4260 
CTATAGAATG TTCTAGAATG TGCACAAGCG GGTTTCTGTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA CCCTTCCTTC GATACCAAAC ACTAACAAGA GGAAGCAGAA TATGAGAAGC 4380 
CATATTTTTA CATAGG AGTC AGATACAAAA AGAAAAATCA CTGAA TGCTT TTAGATATTG 4440 
AATACGTTTT CAGGAAAATG CTAAATCTGA TAGATTACGA AATATATTTT TAG AACTTGT 4500 
TTAGAAAGGA TTCAGTTAAC CAAACAAGAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTCCGTC CCACGTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA A TM TIA TGT ATTTTTCAAA ACTACAAACT GGAAT CCAAC TATAAAGTGT 4680 
TTAAG AATCT ACACAGAATA TTCAAATTAT AGAACATGTT TTTTCCCTTT GCCCCATA AT 4740 
CAGTATTTGC CAAATTACAT GCAATTCCTT AAAAACTAAA TCA CATT GGT AAAAGG CCTA 4800 
CAGCTTTGTA CTTACATTGT GCCAAAGGCT GAGGAAATGT TTTCTTTCG A ATTTTTATGT 4860 
GTATTGTAAA ATGTTCTACC GTACTTTAGT AGTTTGA AGT TTTCAAGTGC ATAACTATTT 4920 
TTGACCAGCA GAAGGCG ATA CGCTTCAGTA TTTTATGCAA TTTTTTTTCA CTTCGAAGGG 4980 
AAAGTGTATT ATAAAAAAAG ATI 1 1 1 1 1 1 1 TTTAAAACAT GCTACTCTTA ATTTTCATGT 5040 
TGGTGATGAA ATTCCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATGAATAAAA 5100 
AGTTAAGCAA AAAAAAAAAA AAAAAAAAAA AAA 



SEQ ID NO:146 PFG6 Protein sequence: 
Protein Accession #: NP_038286.1 

1 11 21 31 41 51 

MSAQSLLHSV FSCSSPASSS AASAKGFSKR KLRQTRSLDP AUGGCGSDE AGAEGSARGA 60 
TAGRLYSPSL PAESU5PRLA SSSRGPPPRA TRLPPPGPLC SSFSTPSTPQ EK5PSGSFHF 120 
DYEVPLGRGG LKKSMAWDLP SVLAGPASSR SASSILCSSG GGPNGIFASP RRWLQQRKFQ 180 
SPPDSRGHPY WWKSEGDFT WNSMSGRSVR LRSVPIQSLS ELERARLQEV PFYQLQQDCD 240 
LSCQIT1PKD GQKRKKSLRK KLDSLGKEKN KDKEFTPQAF GMPLSQV1AN DRAYKLKQDL 300 
QRDEQKDASD FVASLLPFGN KRQNKELSSS NSSLSSTSET PNESTSPNTP EPAPRARRRG 360 
AMSVDSITDL DDNQSRLLEA LQLSLPAEAQ SKKEKARDKK LSLNPIYRQV PRLVDSCCQH 420 
LEKHGLQTVG IFRVGSSKKR VRQLREEFDR GIDVSLEEEH SV1IDVAALLK EFLRDMPDPL 480 
LTRELYTAH NTLLLEPEEQ LGTLQLLTYL LPPCNCDTLH RLLQFLSIVA RHADDNI5KD 540 
GQEVTGNKMT SLNLATIFGP NLLHKQKSSD KEFSVQSSAR AEESTAHAV VQKMIENYEA 600 
LFMVPPDLQN BVUSLLETD PDVVDYLLRR KASQSSSPDM LQSEVSFSVG GRHSSTDSNK 660 
ASSGDISPYD NNSPVLSERS LLAMQEDAAP GGSEKLYRVP GQFMLVGHLS SSKSRESSPG 720 
PRLGKDLSEE PFDIWGTWHS TLKSGSKDPG MTGSSGD1FE S5SLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHPAVSRACS TPHVQVAGKA ERPTARSEQY 840 
LTLSGAHDLS ESELDVAGLQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKP AAAAAWIQGP 900 
PEGVETPTDQ GGQAAEREQQ VTQKKLSSAN SLPAGEQDSP RLGDAGWLDW QRERWQIWEL 960 
LSTDNPDALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002202 

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I 

CCCCCGAGCC GCGCCGAGTC TGCCGCCGCC GCAGCGCCTC CGCTCCGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTCCTAGAT CCGCGAGGGC GCGGCGCAGC CGAGCAGCGG CTCTTTCAGC 120 
ATTGGCAACC CCAGGGGCCA ATATTTCCCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180 
GGCTGTTCAC CAACTGTACA ACCACCATTT C ACTGTGG AC ATTACTCCCT CTTACAG ATA 240 
TGGGAGACAT GGGAGATCCA CCAAAAAAAA AACGTCTGAT TTCCCTATGT GTTGGTTGCG 300 
GCAATCAGAT TCACGATCAG TATATTCTG A GGGTTTCTCC GGATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGCTTTGTTA 420 
GGGATGGGAA AACCTACTGT AAAAGAGATT ATATCAGGTT GTACGGGATC AAATGCGCCA 480 
AGTGCAGCAT CGGCTTCAGC AAGAACG ACT TCGTG ATGCG TGCCCGCTCC AAGGTGTATC 540 
ACATCGAGTG TTTCCGCTGT GTGGCCTGCA GCCGCCAGCT CATCCCTGGG GACGAATTTG 600 
CGCTTCGGGA GGACGGTCTC TTCTGCCGAG CAGACCACGA TGTGGTGGAG AGGGCCAGTC 660 
TAGGCGCTGG CGACCCGCTC AGTCCCCTGC ATCCAGCGCG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGCCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780 
CCACCCGCGT GCGGACTGTG CTGAACG AG A AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840 
CCGCAAACCC GCGGCCAGAT GCGCTCATGA AGGAGCAACT GGTAGAGATG ACGGGCCTCA 900 
GTCCCCGTGT GATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCGAAGCA 960 
TCATGATGAA GCAACTCCAG CAGCAGCAGC CCAATG ACAA AACTAATATC CAGGGGATGA 1020 
CAGGAACTCC CATGGTGGCT GCCAGTCCAG AGAGACACGA CGGTGGCTTA CAGGCTAACC 1080 
CAGTGGAAGT ACAAAGTTAC CAGCCACCTT GGAAAGTACT GAGCGACTTC GCCTTGCAG A 1 140 
GTGACATAGA TCAGCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GGACCGGGCT 1200 
CTAATTCCAC TGGCAGTGAA GTAGCATCAA TGTCCTCTCA ACTTCCAGAT ACACCTAACA 1260 
GCATGGTAGC CAGTCCTATT GAGGC ATGA G GAACATTCAT TCTGTATTTT TTTTCCCTGT 1320 
TGGAGAAAGT GGGAAATTAT AATGTCGAAC TCTGAAACAA AAGTATTTAA CGACCCAGTC 1380 
AATGAAAACT GAATCAAGAA ATGAATGCTC CATGAAATGC ACGAAGTCTG TTTTAATGAC 1440 
AAGGTGATAT GGTAGCAACA CTGTGAAGAC AATCATGGGA TTTTACTAGA ATTAAACAAC 1500 
AAACAAAACG CAAAACCCAG TATATGCTAT TC AATG ATCT TAG AAGTACT GAAAAAAAAA 1560 
GAC OTTTTTA AAACGTAGAG GATTTATATT CAAGGATCTC AAAGAAAGCA TTTTCATTTC 1620 
ACTGCACATC TAGAGAAAAA CAAAAATAGA AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 
TGCTGTTTCT ATATTGGTCA TTGCCTTGCC AAACAGGAGC TCCAGCAAAA GCGCAGGAAG 1740 
AG AGACTGGC CTCCTTGGCT GAAAGAGTCC TTTCAGGAAG GTGGAGCTGC ATTGGTTTGA 1800 
TATGTTTAA A GTTG ACTTTA AC AAGGGGTT AATTG AAATC CTGGGTCTCT TGGCCTGTCC 1860 
TGTAGCTGGT TTATTTTTTA CTTTGCCCCC TCCCCACTTT TTTTGAGATC CATCCTTTAT 1920 
CAAGAAGTCT GAAGCGACTA TAAAGG 1111 TGAATTCAG A TTTAAAAACC AACTTATAAA 1 980 
GCATTGCAAC AAGGTTACCT CTATTTTGCC ACAAGCGTCT CGGGATTGTG TTTGACTTGT 2040 
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGTTTT 2100 
CTCTCTCTAT GG AAATA AAA AGG AAAAAAA AAAGG AAACT TTTTTTGTTT GCTCTTGCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGGAAG ACTTGCCACT TTTCATGTCA 2220 
TTTGACATTT TTTGTTTGCT GAAGTGAAAA AA AAAGATAA AGGTTGTACG GTGGTCTTTG 2280 
A ATTATATGT CTAATTCTAT GTGTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340 
GCCATATGTA GA ATTATATC TTCAGGACTA TTTCACTAAT AAACATTTGG CATAGAT 



SEQ ID NO:148 PFG4 Protein sequence: 
Protein Accession #: NP_002193.1 

1 11 21 31 41 51 
I I I I I I 

MGDPPKKKRL ISLCVGCGNQ IHDQYILRVS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60 
KTYCKRDYIR LYGIKCAKCS IGFSKNDFVM RARSKVYHIE CFRCVACSRQ UPGDEFALR 120 
EDGLFCRADH DWERASLGA GDPLSPLHPA RPLQMAAEPI SARQPALRPH VHKQPEKTTR 180 
VRTVLNEKQL HTLRTCYAAN PRPDALMKEQ LVEMTGLSPR VIRVWFQNKR CKDKKRSIMM 240 
KQLQQQQPND KTNIQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFAUQSDl 300 
DQPAFQQLVN FSEGGPGSNS TGSEVASMSS QLPDTPNSMV ASPIEA 



SEQ ID NO:149 PFG2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001172 

Coding sequence: 39-1103 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GCGGAGCTCT GCCTTGGAGA TTCTCAGTGC TGCGGATCAT GTCCCTAAGG GGCAGCCTCT 60 
CGCGTCTCCT CCAGACGCGA GTGCATTCCA TCCTGAAGAA ATCCGTCCAC TCCGTGGCTG 1 20 
TGATAGGAGC CCCGTTCTCA CAAGGGCAGA AAAGAAAAGG AGTGGAGCAT GGTCCCGCTG 180 
CCATAAG AGA AGCTGGCTTG ATGAAAAGGC TCTCCAGTTT GGGCTGCCAC CTAAAAGACT 240 
TTGGAGATTT GAGTTTTACT CCAGTCCCCA AAGATGATCT CTACAACAAC CTGATAGTGA 300 
ATCCACGCTC AGTGGGTCTT GCCAACCAGG AACTGGCTGA GGTGGTTAGC AGAGCTGTGT 360 
CAGATGGCTA CAGCT G TGTC ACACTGGGAG GAGACCACAG CCTGGCA ATC GGTACCATTA 420 
GTGGCCATGC CCGACACTGC CCAGACCTTT GTGTTGTCTG GGTTGATGCC CATGCTGACA 480 
TCAACACACC CCTTACCACT TCATCAGGAA ATCTCCATGG ACAGCCAGTT TCATTTCTCC 540 
TCAGAGAACT AC AGG ATAAG GTACCACAAC TCCCAGGATT TTCCTGGATC AAACCTTGTA 600 
TCTCTTCTGC AAGTATTGTG TATATTGGTC TGAGAGACGT GGACCCTCCT GAACATTTTA 660 
TTTTAAAGAA CTATGATATC CAGTATTTTT CCATGAGAGA TATTGATCG A CTTGGTATCC 720 
AGAAGGTCAT GGAACGAACA TTTGATCTGC TGATTGGCAA G AGACAAAG A CCAATCCATT 780 
TGAGTTTTGA TATTGATGCA TTTGACCCTA CACTGGCTCC AGCCACAGGA ACTCCTGTTG 840 
TCGGGGGACT AACCTATCGA G AAGGCATGT ATATTGCTGA GGAAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGG ATCTT GTTGAAGTCA ATCCTCAGTT GGCCACCTCA G AGG AAGAGG 960 
CGAAGACTAC AGCTAACCTG GCAGTAGATG TGATTGCTTC AAGCTTTGGT CAGACAAGAG 1020 
AAGGAGGGCA TATTGTCTAT GACCAACTTC CTACTCCCAG TTCACCAGAT GAATCAGAAA 1080 
ATCAAGCACG TGTGAGA ATT TAG GAGACAC TGTGCACTGA CATGTTTCAC AACAGGCATT 1140 
CCAGAATTAT GAGGCATTGA GGGGATAGAT GAATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GCCTTAATGA GAACATTTAC ACATTCTCAC AATTGTAAAG TTTCCCCTCT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGTTTTT TGCAGTTCAC AGGGTATTAA TATGCTACAG 1 320 
TACTATGTAA ATTTAAAGAA GTCATAAACA GCATTTATTA CCTTGGTATA TCATACTGGT 1380 
CTTGTTGCTG TTGTTCCTTC ACATTTAAGT GGTTTTTCAT CTTTCCTCCC TCCTCCCACA 1440 
GCCTGGCTAT ACAGTGCATC CTTGAACTGT CAGCCCACAG CAGCAATATG CTTATTCTAT 1500 
CCACATCCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTCCACAAA CCCTTCCCTA 1560 
TAGAAGTTCA ATGGCTGCGA AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCGAGCT 1620 
CCAGTAAG AT GATAATGG AA AGCACCAGCT TGTTGGTTGT CACTCTACAA AG AG AAGCAA 1680 
AGTGGGGAGT AGTCAGAAGT TTGGATAACC TTCCTTCTAA ACATTTGGGG GTTAGACCTG 1740 
GGACCACGGC TGGATACTCT GAGGCTGTAT GTTTGATCAC AC AGCCACTT AGCAGG AAGT 1 800 
ACTCATAAGG TTCTTTAGCT GTCACTTAGG G ATAACACTG TCTACCTCAC AG AAATGTTA 1 860 
AACTGAGACA ATAAAACCCA AAGCAT 



SEQ ID NO:150 PFG2 Protein sequence: 
Protein Accession #: NP_001163.1 

1 11 21 31 41 51 
I I I I I I 

MSLRGSLSRL LQTRVHSILK KSVHSVAVIG APFSQGQKRK GVEHGPAAIR EAGLMKRLSS 60 
LGCHLKDPGD LSFTPVPKDD LYNNUVNPR SVGLANQELA EVVSRAVSDG YSCVTLGGDH 120 
SLAIGT1SGH ARHCPDLCVV WVDAHADINT PLTTSSGNLH GQPVSFLLRE LQDKVPQLPG 180 
FSWIKPCISS ASIVY1GLRD VDPPEHFILK NYDIQYFSMR DIDRLGIQKV MERTFDLLIG 240 
KRQRPIHLSF DIDAFDPTLA PATGTPWGG LTYREGMYIA EEIHNTGLLS ALDLVEVNPQ 300 
LATSEEEAKT TANLAVDV1A SSFGQTREGG HIVYDQLPTP SSPDESENQA RVR1 



SEQ ID NO:151 PFG1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017906 

Coding sequence: 80-1255 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

AATTATATAT TTTTACTCTA TGTTTCTCTA CATGTTTTTT TCTTTCCGTT GCTGGCGG AA 60 
GAGGCACGTG CGCTGCTG AA TGG AGCTGGT CGCTGGTTGC TACG AGCAGG TCCTCTTTGG 120 
GTTCGCTGTA CACCCGG AGC CCAAGGCTTG CGGCGACCAC GAGCAATGGA CTCTTGTGGC 1 80 
TGACTTCACT CACCATGCTC AC ACTGCCTC CTTGTCAGCA GTAGCTGTAA ATAGTCGTTT 240 
TGTGGTCACT GGGAGC AAAG ATGAAACAAT TCACATTTAT GACATG AAAA AG AAGATTGA 300 
GCATGGGGCT CTAGTGCATC ACAGTGGTAC AATAACTTGC CTGAAATTCT ATGGCAACAG 360 
GCATTTAATC AGTGG AGCGG AAGATGGACT CATCTGTATC TGGGATGCAA AGA AATGGGA 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGACC TTCCTTTCTA TTCACCCATC 480 
TGGCAAGTTG GCCCTGTCGG TTGGTACAGA TAAAACTTTA AG AACGTGGA ATCTTGTAGA 540 
AGGAAGATCA GCATTCATAA AAAATATAAA ACA AAATGCT CACATAGTAG AATGGTCCCC 600 
AAGAGGAGAG CAGTATGTAG TTATCATACA GAATAAAATA G ACATCTATC AGCTTGACAC 660 
TGCATCCATT AGTGCCACCA TCACAAATGA AAAG AGAATT TCC TCTGT TA AATTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGGAGATGA AGAAGTTATA AGGTTTTTTG ACTGTGATTC 780 
ACTAGTGTGC CTCTGCG AAT TTAAAGCTCA TGAAAACAGG GTAAAGGACA TGTTCAGTTT 840 
TG AAATTCCA GAGCATCATG TTATTGTTTC AGCATCGAGT GATGGTTTCA TCAAAATGTG 900 
GAAGCTTAAG CAGG ATAAG A AAGTTCCCCC ATCTTTACTC TGTGAAATAA ACACTAATGC 960 
CAGGCTGACG TGTCTTGGAG TGTGGCTAGA CAAAGTGGCA GACATGAAAA CCCTTCCTCC 1020 
AGCTGCAGAG CCTTCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080 
TGACACAGTG CACAAAG AAG AAAAGCGGTC AAAACCTAAC ACAAAG AAAC GCGGTTTAAC 1140 
AGCTGACAGT AAGAAAGCAA CAAAAGAAAG TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1200 
GGTAGAAATG TTGGAAAAGA AGAGGAAAAA GAAGAAAATA AAAACAATGC AGTGAATCAC 1260 
AGATGTCTCC TGAAAGAACT CTTTTAGATG AAATCATTCT ACTCAAATGT ACCTTAATTT 1320 
TTTTTTTTCC CTGAGTAAAA GCAAGAAATT TCTTCCTTTG GAAAAAATAT ATATATTAAA 1380 
AAACCACTTT TAGATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTACTTTTGG 1440 
CAGACAGTGT TTTATGAATT ATGTATCATG TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTTTTACTT TGTACAAAGC AAATAAAGAT CTTTCTCAAA AAAAAAAAAA AAAA 



SEQ ID NO:152 PFG1 Protein sequence: 
Protein Accession #: NP_060376.1 

1 11 21 31 41 51 
| | | | | | 

MELVAGCYEQ VLFGFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFVVTGSK 60 
DETIHIYDMK KKIEHGALVH HSGTTTCLKF YGNRHLISGA EDGLICIWDA KKWECLKSIK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAFI KNIKQNAHIV EWSPRGEQYV 180 
VIIQNKIDIY QLDTASISGT ITNEKRISSV KFLSESVLAV AGDEEVIRFF DCDSLVCLCE 240 
FKAHENRVKD MFSFEIPEHH VIVSASSDGF IKMWKLKQDK KVPPSLLCEI NTNARLTCLG 300 
VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPGDTVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGLISTK KRKMVEMLEK KRKKKKIKTM Q 



SEQ ID NO:153 PFD6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014668 

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GATGTCTTGG ACATGCTCTG GCTGGCTAAT CTCCATGTTC TAGCCGACTG AAAATACGGT 60 
GGCCAAGTGG ATGGTGTGCT TATTTGCAGT CTAAAGAAAT TTCCTTTTGA TGTGGCAGAA 120 
AATCGAGGAT GTGGAGTGGA GACCCCAGAC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180 
CCTGATCTTC AGTGGGATGG ACCCGCATGG GGAGTCCTTG CCGAGGTCTT TGAGGTACTG 240 
TGACCTGCGA TTGATAAACT CCTCCTGCTT GGTGAGAACA GCCTTGGAGC AGGAGCTGGG 300 
CCTGGCTGCC TACTTTGTGA GCAACGAGGT TCCCTTGGAG AAGGGGGCTA GGAACGAGGC 360 
CTTGGAGAGT GATGCTGAGA AGCTGAGCAG CACAGACAAC GAGGATGAGG AGCTGGGGAC 420 
AGAAGGCTCT ACCTCGGAGA AGAGAAGCCC CATGAAAAGG GAGAGGTCCC GCTCCCACGA 480 
CTCAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GCGCTCGGTG GCGAGTCCTC 540 
GGCTCAGCCC ACAGCACTCC CCCAGGGAGA GCATGCCAGG TCGCCCCAGC CCCGTGGCCC 600 
CGCAGAGGAG GGCAGAGCCC CTGGTGAGAA ACAGAGGCCC CGGGCAAGTC AGGGGCCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGCCCCAG CCCGACTGTA GCCTCAGGAC 720 
CGGCCAGAGG AGCGTCCAGG TGTCGGTCAC CTCGTCGTGC TCCCAGCTGT CCTCCTCCTC 780 
GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840 
GTGCTCCTTG ACCAAGGCCT GCCGCCAGCC ACCCATTGTC TTCTTGCCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG CCTGCCCAAG GCCGCCTCCC TCCTGCCCTC 960 
CCCCTCGGTC ATGTGGGCCA GCTCTTTCCG CCCCCTGCTC AGCAAGACCA TGACATCCAC 1020 
CGAGCAGTCC CTCTACTACC GGCAGTGGAC GGTGCCCCGG CCCAGCCACA TGGACTACGG 1080 
CAACCGGGCC GAGGGCCGCG TGGACGGCTT CCACCCCCGC AGGCTGCTGC TCAGCGGCCC 1140 
CCCTCAGATC GGGAAGACAG GTGCCTACCT GCAGTTCCTC AGTGTCCTGT CCAGGATGCT 1200 
TGTTCGGCTC ACAGAAGTGG ATGTCTATGA CGAGGAGGAG ATCAATATCA ACCTCAGAGA 1260 
AGAATCTGAC TGGCATTATC TCCAGCTTAG CGACCCCTGG CCAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCCC TTTGACTACA TCATTCACGA CCCGAAGTAT GAAGATGCCA GCCTGATTTG 1380 
TTCGCACTAT CAGGGTATAA AGAGTGAAGA CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440 
TTATGTGCGG CGTCAGACGG CACGGATGAG ACTGTCCAAG TACGCAGCGT ACAACACTTA 1500 
CCACCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTCCAC CCCCGCTACC AGCTGTATGA 1560 
GTCCACCCTG CACGCCTTTG CCTTCTCTTA CTCCATGCTA GGAGAGGAGA TCCAGCTGCA 1620 
CTTCATCATC CCCAAGTCCA AGGAGCACCA CTTTGTCTTC AGCCAACCTG GAGGCCAGCT 1680 
GGAGAGCATG CGACTACCCC TCGTGACAGA CAAGAGCCAT GAATATATAA AAAGTCCGAC 1740 
ATTCACTCCA ACCACCGGCC GTCACGAACA TGGGCTCTTT AATCTGTACC ACGCAATGGA 1800 
CGGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGGAATAC GAGATGGCAA TTTATAAGAA 1860 
ATATTGGCCC AACCACATCA TGCTGGTGCT CCCCAGTATC TTCAACAGTG CTGGAGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGGAGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACCG 1980 
GCAGGAGGAG CTGGGAATCA AGCCGCAGGA CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040 
CTCCTGCGTG ATGTGGAACG TGGTGGATGT CAACTCTGCT GGGGAGAGAA GCAGGGAGTT 2100 
CTCCTGGTCG GAAAGGAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTACGCCC TGCTGGGCCT GCGGAAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA GAGCCCTTCT CCCGCTGCCA CGTGCACAAC TTCATCATCC TGAACGTGGA 2280 
CCTGACCCAG AACGTGCAGT ACAACCAGAA CCGGTTCCTG TGTGACGATG TAGACTTCAA 2340 
CCTGCGGGTG CACAGCGCCG GCCTCCTGCT CTGCCGGTTC AACCGCTTCA GCGTGATGAA 2400 
GAAGCAGATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGG TGTCTGATAA 2460 
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
OGCAGCGCCC GCCCAGCTCC TGCTGGAGAA GTTCCTGCAG CACCACAGCC ACCTCTTCTT 2580 
CCCGCTGTCC CTGAAGAACC ATGACCACCC AGTGCTGTCT GTCGACTGTT ACCTGAACCT 2640 
GGGATCTCAG ATTTCTGTTT GCTATGTGAG CTCCAGGCCC CACTCTTTAA ACATCAGCTG 2700 
CTCGGACTTG CTGTTCAGTG GGCTGCTGCT GTACCTCTGT GACTCTTTTG TGGGAGCTAG 2760 
CTTTTTGAAA AAGTTTCATT TTCTGAAAGG TGCGACGTTG TGTGTCATCT GTCAGGACCG 2820 
GAGCTCACTG CGCCAGACGG TCGTCCGCCT GGAGCTCGAG GACGAGTGGC AGTTCCGGCT 2880 
CCGCGATGAG TTCCAGACCG CCAATGCCAG GGAAGACCGG CCGCTCTTTT TTCTGACGGG 2940 
ACGACACATC TGAGGAAGAC AGCGGCGAGT TTTCTGAAGA GATGAGTGCT CAGAGCCCTC 3000 
ATGCTGTTGA GGCTAAAGGG AGGCCTGGAA CGGTGGGGCG TTTGACTGGA ATGGACCCCA 3060 
GGGACTGTCC AGGTGCAGCC CCTCCTAGTA CACATGGGCC CCCGAGGCCG TGGTCCTGGG 3120 
AGCCAGGAAG ACTCCGCAGT GGGTGAGAAT GAAAACTTGA GACTCCCAAG TTCTGGGCCA 3 180 
GCCCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCACGAGG AACA AAGATT TACTTCCTGT 3240 
CCTGCCATTC GTGTGCTTCC ATGGACAAAC CTGATTTTTT TCTCTTAGTT CTAAAGAATC 3300 
TTGGGTTATT TTGTAGCGGT GCCAGTATTT CAGTAGATGG GATTTCAGCC AAGTAGGTTC 3360 
CCCTGTAACC TCCTACAAAG CAATATTCCA AAGGAACATT TTAACTGTA A AGGCTGGAGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AGTTTGCTGT 3480 
TTCAGAGGCT AGCCCAAAGG CATCAAATTT AATAAAGTTA AACAAATTGA TTTACTTCAG 3540 
AGCAAATATG ATCCTATTAA AATAATATAG GGTAAATACC CTACCTCTTA GAAAGGGCAA 3600 
AAATGCAAAG AAGCTTTCTT TAAAACTAAA AGGGTTTTTT GGGGGGGGAG TTGGCGGGGA 3660 
GGAAATAAGG CTAACAGAGG TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGACCACAT 3720 
TGCTTACTTG AAACAG ACAA TGAAA ACAAC CAAAGTG ATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTC ATTTT 3840 
CTCTTACTAA TGTTTGGTTC CTCAGGGAAG AATCTCACTT GACTAGAGAG GAGGTGGGAA 3900 
CAGAAG AGAG AAGG AGGCAG GGAGATGTAT TTCTTAGGGC TCACCCCTTC ACAGACTGAC 3960 
AGAATGGTTT TGTTTTGTTT TGTTTTGTTT TGTTTTGTTT TTGAGATGGA CTCTAGCTCT 4020 
GTCACCCAGG CTGGAGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGG 4080 
TTCTCACCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGCG CCCACCACCA 4140 
CGCCCGGCTA ATTTTTTGTA TTTTTTAGTA GAGACGGGGT TTCACCATGT TAGCCAGG AT 4200 
GGTCTCGATC TCCTG ACCTC GTGATCCGCC CGCCTCGGCC TCCCAAAGTG CTGGGATTAC 4260 
AGGCGTGAGC CACCGTGCCT GCCCCAGAAT GGTTTTTAAA GCCACAGTTG AGAGGCCACC 4320 
CATTGCCCGG CGCCTGG ACA GTGATC ATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTG 4380 
ATTGGAATTA TTCATCCCCT TTGAAAGATG AGAAGGTTGA GATGCAAAGA GTCTACCTTT 4440 
CCAAGTTCTC ACTGCTGGAA AGAGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGGACTC 4500 
TGCAGTCCAG GTCTCCCTTC TCCCACTTGC CTACCCTCAA TGCCACACTG TTTTTGAAGT 4560 
GGCCCATAAC TTGAAGGAAA AGTTTAAAGA CAGTTCAATT TAATCATCAG AATGCATTCT 4620 
TTTTTTTTTC GGAGACGGAG TTTCACTCTT GCTGCCCAGG CTGGAGTGCA ATGGTGCAAT 4680 
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGGTTCAAGT GATTCTCCAG CCTCAGCCTC 4740 
CCGAGTAGCT GGG ATTATGG GCGCCCACCA CCATGCCCAG CTAATTTTTG TATTTTTTTT 4800 
TTTTAGTAGA GATGGGGTTT CGCCAGGTTG GCCAGGCTGG TCTTGTGAAC TCCTGGCCTC 4860 
AGGTG ATCTG CCCACCTCAT CCTCCAAA AG TGCTGGGATT ACAGGCATGA GCCACTGCGC 4920 
CTGGCCTCAG AATGCATTCT TACACATCTA TCCTAG ACAT TTATAAGCAC TCTAATGGAT 4980 
AACAATCCAA G AATAAATGA TTGTAAAAG A TGATGCCGAA GAGTTGATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTCCG CGAGTATTAA ATATTTAGAT CAATGTTTAT AAAATGATTA 5100 
CTTTGTATAT CTCATTATTC CTATTTTGGA ATAAAAACTG ACCTTCTTTA ATCATATACT 5160 
TGTCTTTTGT AAATAGCAGC TTTTGTGTCA TTCTCCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence: 
Protein Accession #: NP_0554811 

1 11 21 31 41 51 
I I I I I I 

MWQK1EDVEW RPQTYLELEG LPCILIFSGM DPHGESLPRS LRYCDLRUN SSCLVRTALE 60 
QELGLAAYFV SNEVPLEKGA RNEALESDAE KLSSTDNEDE ELGTEGSTSE KR5PMKRERS 120 
RSHDSASSSL SSKASGSALG GESS AQPTAL PQGEHARSPQ PRGPAEEGRA PGEXQRPRAS 180 
QGPPSAISRH SPGPTPQPDC SLRTGQRSVQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASQCSLTKA CRQPPIVFLP KLVYDMVVST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300 
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDGFHPRRLL LSGPPQIGKT G A YLQFLS VL 360 
SRML.VRLTEV DVYDEEEIN1 NLREESDWHY LQLSDPWPDL ELFKKLPFDY IIHDPKYEDA 420 
SLICSHYQGI KSEDRGMSRK PEDLYVRRQT ARMRLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AFSYSMLGEE IQLHHIPKS KEHHFVFSQP GGQLESMRLP LVTDKSHEY1 540 
KSPTFTPTTG RHEHGLFNLY HAMDGASHLH VLWKEYEMA IYKKYWPNH1 MLVLPSIFNS 600 
AGVGAAHFU KELS YHMLEL ERNRQEELGI KPQDIWPHV ISDDSCVMWN WDVNSAGER 660 
SREFSWSERN VSLKHIMQW EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFO 720 
LNVDLTQNVQ YNQNRFLCDD VDFNLR VHS A GLUUCRFNRF SVMKKQIWG GHRSFHITSK 780 
VSDNSAAWP AQY1CAPDSK HTFLAAPAQL LLEKFLQIIHS HLFFPLSLKN HDHPVLSVDC 840 
YLNLGSQISV CYVSSRPHSL N1SCSDLLFS GLLLYLCDSF VGASFLKKFH FLKGATLCVI 900 
CQDRSSLRQT VVRLELEDEW QFRLRDEFQT ANAREDRPLF FLTGRHI 



SEQ ID NO:155 PFC6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000522 

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

ATGACAGCCT CCGTGCTCCT CCACCCCCGC TGG ATCGAGC CCACCGTCAT GTTTCTCTAC 60 
GACAACGGCG GCGGCCTGGT GCCCGACGAG CTCAACAAGA ACATGGAAGG GGCGGCGGCG 120 
GCTGCAGCAG CGGCTGCAGC GGCCGCGGCT GCCGGGGCCG GGGGCGGGGG CTTCCCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TCGGTGGCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACCAGTG CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCCAGGA 300 
GCCGCGTCCG CCTACAGCAG CGCCCCCGGG GAGGCGCCCC CGTCGGCTGC CGCCGCTGCT 360 
GCCGCGGCTG CCGCTGCAGC CGCCGCCGCC GCCGCCCCGT CGTCCTCGGG AGGTCCCGGC 420 
CCGGCGCGCC CGGCGGCGGC AGAGGCGGCC AACCAATGCA GCCCCTGCTC GGCAGCGGCC 480 
CAG AGCTCGT CGGGGCCCGC GGCGCTGCCC TATGGCTACT TCGGCAGCGG CTACTACCCG 540 
TGCGCCCGCA TGGGCCCGCC CCCCAACGCC ATCAAGTCGT GCCCCCAGCC CCCCTCGGCC 600 
GCCGCCGCCG CCGCCTTCGC GGACAAGTAC ATGGATACCG CCGGCCCAGC TGCCG AGGAG 660 
TTCAGCTCCC GCGCTA AGGA GTTCGCGTTC TACCACCAGG GCTACGCAGC CXjGGCCTTAC 720 
CACCACCATC AGCCCATGCC TGGCTACCTG GATATGCCAG TGGTGCCGGG CCTCGGGGGC 780 
CCCGGCGAGT CGCGCCACGA ACCCTTGGGTCTTCCCATGG AAAGCTACCA GCCCTGGGCG 840 
CTGCCCAACG GCTCG AACGG CCAAATGTAC TGCCCCAAAG AGCAGGCGCA GCCTCCCCAC 900 
CTCTGGA AGT CCACTCTGCC CG ACGTGGTC TCCCATCCCT CGGATGCCAG CTCCTATAGG 960 
AGGGGGAGAA AGAAGCGCGT GCCTTATACC AAGGTGCAAT TAAAAGAACT TGAACGGGAA 1020 
TACGCCACGA ATAAATTCAT TACTAAGGAC AAACGGAGGC GGATATCAGC CACGACGAAT 1080 
CTCTCTGAGC GGCAGGTCAC AATCTGGTTC CAGAACAGGA GGGTTAAAGA GAAAAAAGTC 1 140 
ATCAACAAAC TGAAAACCAC TAGTTAA 



SEQ ID NO:156 PFC6 Protein sequence: 
Protein Accession #: NP_000513 

1 11 21 31 41 51 
| | | | | | 

MTASVLLHPR WIEPTVMFLY DNGGGLVADE LNKNMEGAAA AAAAAAAAAA AGAGGGGFPH 60 
PAAAAAGGNF SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSSAPG EAPPSAAAAA 120 
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYPGSGYYP 180 
CARMGPPPNA IKSCPQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240 
HHHQPMPGYL DMPVVPGLGG PGESRHEPLG LPMESYQPWA LPNGWNGQMY CPKEQAQPPH 300 
LWKSTLPDVV SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFITKD KRRRISATTN 360 
LSERQVTIWF QNRRVKEKKV INKLKTTS 



SEQ ID NO:157 PFA3 DNA SEQUENCE 

Nucleic Acid Accession #: AW102723 

Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 
CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 
TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGG AGGAC 1 80 
ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 
ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 
GTGTGCGAAG CCACCAAG AC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 
TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 
TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 
TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 
AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 
AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 
TGTCAAG ACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 
AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 
GAACGGCTGA ATGTTGCACT TCAGAGA ACA TTGGCAAAGC ACAAAATAAA AG AAAGCAGG 840 
AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 
CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACG AGGAAG 960 
ATGAAAACAT CCTTGGGGTG GTTGG AGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 
CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 
TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 
AGAACCACCT CCCTG ATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATG AA 1200 
ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260 
AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 
AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAG AC ATTTCCATTC 1 380 
CATT7XATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAG AAGGCTG 1440 
ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1 500 
AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 
GTG AGGAG AT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1 620 
ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 
TTAGAAG ATT TTACAGGACG AGGGCTCTAC CTCTCAG ACA TCCCAATTCA CA ATGCACTG 1740 
AGGGATGTGG TCTTAATAGG GG AACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1 800 
GGG AAGCTG A AGGCTACCCT TG AGCAAGCC CACCAAGCCC TGGAGGAGG A GAAGAAAAAG 1 860 
ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 
CAAGTTGTGC AAGCCAAG AA GTTCAGTA AT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1 980 
TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 
TACACTCGCT TCGACGAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 
ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTC AG ATA 2160 
GCGCTGATGG CCCTG AAGAT G ATGG AGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 
CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 
AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTG AGTCC 2340 
TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 
CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTG AA 2460 
ATCCCCGG AA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 



TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 
TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTG AA GATGTGTAGA 2640 
GCCTCTGAAA GCACTTTAGG GATTGTAGAT GG CTAA CAAG CAGTATTAAA ATTTCAGGAG 2700 
CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCAAGAA AAAAAAAA AA ACCTTAAAAA GCTA C11I IG TGGGAGTATT TCTATTATAT 2820 
AACCAGCACT TACTACCTGT ACTCAA AATT CAGCACCTTG TACATATATC AGATAATTGT 2880 
AGTCAATTGT ACAAACTG AT GG AGTCACCT GCA ATCTCAT ATCCTGGTGG AATGCCATGG 2940 
TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CA AAAAAAAA AA AAAAAAAA AAAAAAAAAA 3000 
AAAA 



SEQ ID NO:158 PFA3 Protein sequence: 
Protein Accession #: NP_000847.1 



1 11 21 31 41 51 
| | | | | | 

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLEAPLKI FKQLQYPSET EQPLPRSRKK 180 
GQLEDASILC LDKEDDFLHV YYFFPKRTTS LILPGIIKAA AHVLYETEVE VSLMPPCFHN 240 
DCSEFVNQPY LLYSVHMKST KPSLSPSKPQ SSLVIPTSLF CKTFPFHFMF DKDMTILQFG 300 
NGIRRLMNRR DFQGKPNFEY FEILTPKINQ TFSGIMTMLN MQFVVRVRRW DNSVKKSSRV 360 
MDLKGQMIYI VESSAILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDVV LIGEQARAQD 420 
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCSIFPCEVA QQLWQGQVVQ AKKFSNVTML 480 
FSDIVGFTAI CSQCSPLQVI TMLNALYTRF DQQCGELDVY KVETIAMHV WLGGLHKESD 540 
THAVQIALMA LKMMELSDEV MSPHGEPIKM RIGLHSGSVF AGVVGVKMPR YCLFGNNVTL 600 
ANKFESCSVP RKINVSPTTY RLLKDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660 
TNSKPCFQKK DVEDASQFFR QSIRNRLATY IPIYKSLGFD SLKMCRASES TLGIVDG 



SEQ ID NO:159 PFA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004362 

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CGCCGGCGGG ACTGGTCTG A AGAGACGCGG GGACAAAGTG GCAACG ACTT GGACATCTG A 60 
GCTGTCACTG CCGAAAACAG GCCGCAAGAG AGATAATCAA TATGCATTTC CAAGCCTTTT 120 
GGCTATGTTT GGGTCTTCTG TTCATCTCA A TTAATGCAG A ATTTATGGAT GATGATGTTG 180 
AGACCGAAGA CTTTGAAGAA AATTCAGAAG AAATTG ATGT TAATG AAAGT GAACTTTCCT 240 
CAGAGATTAA ATATAAGACA CCTCAACCTA TAGG AGAAGT ATATTTTGCA GAAACTTTTG 300 
ATAGTGG AAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA GAAAGATG AC ATGGATGAGG 360 
AAATTTCAAT ATACGATGGA AGATGGG AAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420 
GTGACAGAGG A CTGGTA TTA AAATCTAGAG CA A AGCATCA TGCAATATCT GCTGTATTAG 480 
CAAAACCATT CATTTTTGCT GATAAACCCT TGATAGTTCA ATATG AAGTA AATTTTCAAG 540 
ATGGTATTG A TTGTGGAGGT GCATACATTA AACTCCTAGC AGACACTGAT GATTTGATTC 600 
TGGAAAACTT TTATG ATAAA ACATCCTATA TCATTATGTT TGG ACCAGAT AAATGTGGAG 660 
AAGATTATAA ACTTCATTTT ATCTTCAGAC ATAAA CATCC CAAAACTGGA GTTTTCGAAG 720 
AG AAACATGC CAAACCTCCA G ATGTAGACC TTAAAAAGTT CTTTACAG AC AGGAAG ACTC 780 
ATCTTTATAC CCTTGTGATG AATCCAGATG ACACATTTGA GGTGTTAGTT G ATCAAACAG 840 
TTGTAAACAA AGGAAGCCTC CTAGAGGATG TGGTTCCTCC TATCAAACCT CCCAAAGAAA 900 
TTGAAGATCC CAATG ATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTCCTGATC 960 
CTTCTGCCGT CAAACCAGAA GACTGGGATG AAAGTGAACC TGCCCAAATA GAAGATTCAA 1020 
GTGTTGTTAA ACCTGCTGGC TGGCTTG ATG ATGAACCAAA ATTTATCCCT GATCCTAATG 1080 
CTG AAAAACC TGATGACTGG AATGAAG ACA CGGATGGAGA ATGGGAGGCA CCTCAGATTC 1 140 
TTAATCCAGC ATGTCGGATT GGGTGTGGTG AGTGGAAACC TCCCATGATA G ATAACCCAA 1200 
AATACAAAGG AGTATGGAGA CCTCCACTGG TCGATAATCC TAACTATCAG GGAATCTGGA 1260 
GTCCTCGAAA AATTCCTAAT CCAGATTATT TOG AAG ATG A TCATCCATTT CTTCTGACTT 1320 
CTTTCAGTGC TCTTGGTTTA GAGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380 
TTATTATCTG TTCGGAAAAG G AAGTAGCAG ATCACTGGGC TGCAGATGGT TGGAGATGGA 1440 
AAATAATGAT AGCAAATGCT AATAAGCCTG GTGTATTAAA ACAGTTAATG GCAGCTGCTG 1500 
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTGACAGC AGGAGTGCCA ATAGCATTAA 1560 
TTACTTCATT TTGTTGGCCA AGAAAAGTAA AGAAAAAACA TAAAGATACA GAGTATAAAA 1620 
AAACCGACAT ATGTATACCA CAAACAAAAG GAGTACTAGA GCAAGAAGAA AAGGAAGAGA 1680 
AAGCAGCCCT GG AAAAACCA ATGGACCTGG AAGAGGA AAA AAAGCAA AAT GATGGTGAAA 1740 
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800 
TAGAAGGGCA AGAAGAAAGT AATCA ATCAA ATAAGTCTGG GTCAG AGGAT GAGATGAAAG 1 860 
AAGCAGATGA GAGCACAGGA TCTGGAGATG GGCCG ATAAA GTCAGTACGC AAAAGAAG AG 1920 
TACGAAAGGA CIMACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980 
AAAAATCAGC ATGCCAGACC TGAACTTTAA TCAGTCTGCA C ATCCTGTTT CTAATATCTA 2040 
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCGAGGA AAAAGAAGCA 2100 
ACTTTGAAGT TACCTCATCT TTGAATTTAG AATAAAAGTG GCACATTACA TATCGGATCT 2160 
AAGAGATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGG AGATAG TTTTGGTTTG 2220 
TACAGAACAA AATAATATGT AGCAGCTTCA TTGCTATTGG AAAAATCAGT TATTGGAATT 2280 
TCCACTTAAA TGGCTATACA ACAATATAAC TGGTAGTTCT ATAATAAAAA TGAGCATATG 2340 
TTCTGTTGTG AAGAGCTAAA TGCAATAAAG TTTCTGTATG GTTGTTTGAT TCTATCAACA 2400 
ATTGAAAGTG TTGTATATCA CCCACATTTA CCTAGTTTCT GTCAAATTAT AGTTACAGTG 2460 
AGTTGTTTGC TTAAATTATA GATTCCTTTA AGG ACATGCC TTGTTCATAA AATCACTGGA 2520 



TTATATTGCA GCATATTTTA CA TTTG AATA CAAGG ATAAT GGGTTTTATC AAAACAAAAT 2580 
GATGTACAGA TTTTTTTTCA AG 1 1 T IT ATA GTTGCTTTAT GCCAGAGTGG TTTACCCCAT 2640 
TCACAA AATT TCTTATGCAT ACATTGCTAT TGAA AATAA A ATTTAAATAT TTTTTCATCC 2700 
TGAAAAAAAA 

SEQ ID NO:160 PFA1 Protein sequence: 
Protein Accession #: NP_004353.1 

1 11 21 31 41 51 
| | | | | | 

MHFQAFWLCL GLLFISINAE FMDDDVETED FEENSEEIDV NESELSSEIK YKTPQPIGEV 60 
YFAETFDSGR LAGWVLSKAK KDDMDEEISI YDGRWEIEEL KENQVPGDRG LVLKSRAKHH 120 
AISAVLAKPF IFADKPLIVQ YEVNFQDGID CGGAYIKLLA DTDDLILENF YDKTSYIIMF 180 
GPDKCGEDYK LHFIFRHKHP KTGVFEEKHA KPPDVDLKKF FTDRKTHLYT LVMNPDDTFE 240 
VLVDQTVVNK GSLLEDVVPP IKPPKEIEDP NDKKPEEWDE RAKIPDPSAV KPEDWDESEP 300 
AQIEDSSVVK PAGWLDDEPK FIPDPNAEKP DDWNEDTDGE WEAPQILNPA CRIGCGEWKP 360 
PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
DIYFDNFIIC SEKEVADHWA ADGWRWKIMI ANANKPGVLK QLMAAAEGHP WLWLIYLVTA 480 
GVPIALITSF CWPRKVKKKH KDTEYKKTDI CIPQTKGVLE QEEKEEKAAL EKPMDLEEEK 540 
KQNDGEMLEK EEESEPEEKS EEEIEIIEGQ EESNQSNKSG SEDEMKEADE STGSGDGPIK 600 
SVRKRRVRKD 



SEQ ID NO:161 PEZ9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005932 

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GCGGAGCGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGGACT 60 
GCTCTGGTGC TAGAATGCTG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG 120 
CTCTGCCGCC CCGCCGGGCG GGCCGGGGAA GCCTCGAAGC CGGGATCCGG GCCCGAAGGG 180 
TCAGCACCAG CTGGTCTCCC GTGGGCGCCG CCTTCAATGT CAAGCCCCAG GGCAGCCGCT 240 
TGGACCTGTT CGGCGAGCGG GCGCGTCTTT TTGGAGTTCC TGAGCTGAGT GCCCCAGAAG 300 
GATTTCATAT TGCACAAGA A AAAGCCTTGA GAAAGACAG A ATTGCTTGTG GACCGTGCAT 360 
GTTCCACCCC ACCTGGGCCC CAG ACCGTGC TGATCTTCGA TGAGCTCTCG GATTCCTTAT 420 
GCAGAGTGGC CGACTTGGCT GATTTTGTGA AAATCGCTCA CCCTGAGCCA GCATTCAGAG 480 
AAGCTGCGGA AGAAGCTTGT AGAAGTATTG GCACCATGGT AGAGAAGTTG AACACAAATG 540 
TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATTCCCTTG 600 
ATCCAGA AAC A AGGCG AGTG GCTGA ACTGT TTATGTTTGA TTTTGAAATT AGTGGAATCC 660 
ATCTAG ACAA ACAAAAGCGT AAAAGAGCAG TGGACCTCAA TGTTAA AATC TTGGATTTG A 720 
GTAGTACATT TCTTATGGGA ACCAATTTTC CCAACAAGAT TG AGAAGCAT CTCTTACCAG 780 
AACACATTCG TCGTAACTTT ACATCTGCTG GGGATCATAT CATAATTGAT GCTCTCCACG 840 
CAGAATCACC AGATGACTTG GTGCGAGAAG CTGCTTATAA AATTTTTCTT TATCCCAATG 900 
CTGGTCAATT GAAATGTTTA GAAGAATTGC TCAGCAGCAG AGATCTTCTG GCAAAGTTGG 960 
TGGGGTATTC CACGTTTTCT CACAGGGCTC TCCAAGGAAC GATAGCTAAA AATCCAGAGA 1020 
CTGTCATGCA GTTCCTTG AA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 
TTGAGATGAT ACGAGGGATG AAAATGAAAC TGAATGCTCA AAATTCCGAA GTAATGCCCT 1140 
GGGACCCCCC TTACTACAGT GGTGTGATTC GTGCAGAAAG GTATAATATT GAGCCCAGCC 1200 
TATATTGCCC GTTTTTCTCT CTTGGAGCAT GCATGGAAGG CCTGAATATT TTGCTTAACA 1260 
GACTGTTGGG GATTTCATTA TATGCAG AGC AGCCTGCAAA AGGAGAGGTG TGGAGCG AAG 1320 
A TGTCCGA AA ACTGGCTGTT GTTCATG AAT CTGAAGGATT GTTGGGGTAC ATTTACTGTG 1380 
ATTTTTTTCA GCGAGCAGAC AAACCACATC AGGATTGCCA TTTCACTATC CGTGGAGGCA 1440 
GACTAAAGGA AGATGGAGAC TATCAACTCC CACTTGTAGT TCTTATGCTG AATCTTCCCC 1500 
GTTCCTCAAG GAGTTCTCCA ACTTTGCTAA CTCCTGGCAT GATGGAAAAT CTTTTCCATG 1560 
AAATGGGACA TGCCATGCAT TCAATGCTAG GACGTACTCG TTACCAACAC GTCACTGGGA 1620 
CCAGGTGCCC TACTGATTTT GCTGAGGTTC CTTCTATTCT GATGGAGTAC TTTGCAAATG 1680 
ATTATCGAGT AGTTAACCAA TTTGCCAGAC ATTATCAGAC TGGACAGCCA CTGCCAAAAA 1740 
ATATGGTGTC TCGTCTTTGT GAATCTAAAA AGGTTTGTGC TGCAGCTGAT ATGCAACTTC 1800 
AGGTCTTTTA TGCCACTCTG GATCAAATCT ACCATGGGAA GCATCCCCTG AGGAATTCAA 1860 
CCACAGACAT TCTCA AGG AA ACACAAGAGA AATTCTATGG CCTACCATAT GTTCCAAATA 1920 
CTGCCTGGCA GCTGCGATTC AGCCACCTCG TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TCCATGGTTT GGAAGGAGTG TTTTCTACAG GATCCTTTCA 2040 
ACAGGGCTGC CGGGGAGCGC TATCGCAGGG AGATGCTGGC CCACGGTGGA GGCAGCGAGC 2100 
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATGAC TTCGTAAGTG 2160 
CCCTCGTTTC CGACTTGGAT CTGG ACTTCG AAACTTTCCT CATGGATTCT GAATAAAAGA 2220 
AACACTCTAC ACCTCTAATC AAGGTCATGT AGTAATGACT TTGTTATAAA TGCTACAGCT 2280 
GTGAGAGCTT GTTTCTGATT GTTTCATTGT TCGCTTCTGT AATTCTGAAA AACTTTAAAC 2340 
TGGTAGAACT TGGAATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA 






SEQ ID NO:162 PEZ9 Protein sequence: 

Protein Accession #: NP_005923.1 

1 11 21 31 41 51 
| | | | | | 

MLCVGRLGGL GARAAALPPR RAGRGSLEAG IRARRVSTSW SPVGAAFNVK PQGSRLDLFG 60 
ERARLFGVPE LSAPEGFHIA QEKALRKTEL LVDRACSTPP GPQTVLIFDE LSDSLCRVAD 120 
LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180 
RVAELFMFDF EISGIHLDKQ KRKRAVDLNV KILDLSSTFL MGTNFPNKIE KHLLPEHIRR 240 
NFTSAGDHII IDALHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300 
FSHRALQGTI AKNPETVMQF LEKLSDKLSE RTLKDFEMIR GMKMKLNAQN SEVMPWDPPY 360 
YSGVIRAERY NIEPSLYCPF FSLGACMEGL NILLNRLLGI SLYAEQPAKG EVWSEDVRKL 420 
AVVHESEGLL GYIYCDFFQR ADKPHQDCHF TIRGGRLKED GDYQLPLVVL MLNLPRSSRS 480 
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFAEVPSILM EYFANDYRVV 540 
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHGKH PLRNSTTDIL 600 
KETQEKFYGL PYVPNTAWQL RFSHLVGYGA RYYSYLMSRA VASMVWKECF LQDPFNRAAG 660 
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVSALVSD LDLDFETFLM DSE 



SEQ ID NO:163 PEZ8 DNA SEQUENCE 

Nucleic Acid Accession #: AF103907 

Coding sequence: none (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAGAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGG AGACC AGGAAGATCT GCATGGTGGG AAGG ACCTGA TG ATACAGAG 1 20 
GAATTACAAC ACATATACTT AGTGTTTCAA TGA ACACCAA GATAAATAAG TGAAG AGCTA 1 80 
GTCCGCTGTG AGTCTCCTCA GTGACACAGG GCTGGATCAC CATCGACGGC ACTTTCTGAG 240 
TACTCAGTGC AGCAAAGAAA GACTACAGAC ATCTCAATGG CAGGGGTGAG AAATAAGAAA 300 
GGCTGCTG AC TTTACCATCT GAGGCCACAC ATCTGCTGAA ATGGAGATA A TTAACATCAC 360 
TAGAAACAGC AAGATGACAA TATAATGTCT AAGTAGTG AC ATGTTTTTGC ACATTTCCAG 420 
CCCCTTTAAA TATCCACACA CACAGG AAGC ACAAAAGGAA GCACAGAGAT CCCTGGGAGA 480 
AATGCCCGGC CGCCATCTTG GGTCATCG AT GAGCCTCGCC CTGTGCCTGG TCCCGCTTGT 540 
GAGGGAAGGA CATTAGAAAA TGAATTGATG TGTTCCTTAA AGGATGGGCA GGAAAACAGA 600 
TCCTGTTGTG GATATTTATT TGAACGGGAT TACAGATTTG AAATGAAGTC ACAAAGTGAG 660 
CATTACCAAT GAGAGGAAAA CAGACG AGAA AATCTTGATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCA AGCT GGGGAGG AG A TAACCACGGG 780 
GCAGAGGGTC AGGATTCTGG CCCTGCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC A AAGCTGTTG TAATATCTGA TCTCTACGGT TCCTTCTGGG 900 
CCCAACATTC TCCATATATC CAGCCAC ACT CA TTTTTA AT ATTTAGTTCC CAGATCTGTA 960 
CTGTGACCTT TCTACACTGT AGA ATAAC AT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020 
GCTGCCTAAT ATGTAGCTG A CTGTTTTTCC TAAGG AGTGT TCTGGCCCAG GGGATCTGTG 1080 
AACAGGCTGG GAAGCATCTC AAGATCTTTC CAGGGTTATA CTTACTAGCA CACAGCATGA 1 140 
TCATTACGGA GTGAATTATC TAATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTCCCA CTTTTGTGCC C ATTCTCAAG ACCTCAAAAT GTCATTCCAT TAATATCACA 1260 
GGATTAACTT TTTTTTTTAA CCTGG A AGAA TTCAATGTTA CATGCAGCTA TGGGAATTTA 1320 
A TTACATATT TTGTTTTCCA GTGCAAA GAT G ACTAAGTCC TTTATCCCTC CCCTTTGTTT 1 380 
GATTTTTTTT CCAGTATAAA GTT A AAA TG C TTAGCCTTGT ACTGAGGCTG TATACAGCAC 1440 
AGCCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAACAAAATC TAACTTGTAA TTCCTTGAAC ATGTCAGGAC ATACATTATT CCTTCTGCCT 1560 
GAGAAGCTCTTCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTGAAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAGAAG GGACACATAT GAGATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAG AGTTTA GATAAATATA TG AAATGCAA GAGCCACAGA 1740 
GGGAATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1 800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAGAATTC ATGCAGTGCA AATCCCCAAA GGTAACCTTT ATCCATTTCA TGGTGAGTGC 1920 
GCTTTAGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980 
TTGTAGTTAA TTGAAAGAAA TAGGGCACTC TTGTGAGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAGAAT TTACAAAG AG CTACTCAGGA CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100 
GTGTGTGTGT GAGTGTACAT GCCAAAGTGT GCCTCTCTCT CTTGACCCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TGAGCTGCCA ATGATGTATC ACCACCATAT 2220 
CTCATTATTC TCCAGTAAAT GTG ATAATAA TGTCATCTGT TAACATAAAA AAAGTTTGAC 2280 
TTCACAAAAG CAGCTGG AAA TGGACAACCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACAT A TATTGTTAGA AGCACCTCGC ATTTGTGGGT TCTCTTAAGC 2400 
AAAATACTTG CATTAGGTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTG AATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAG AGGATTC AGACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TCCCTCTTTG 2580 
TGTTCATGGA TAGTCCAATA AATAATGTTA TCTTTGAACT GATGCTCATA GGAGAG AATA 2640 
TAAG AACTCT G AGTG ATATC AACATTAGGG ATTCAAAGAA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GGAACCAAGA TACAAAG A AC TCTGAGCTGT CATCGTCCCC ATCTCTGTGA 2760 
GCCACAACCA ACAGCAGGAC CCAACGCATG TCTGAGATCC TTAAATCAAG GAAACCAGTG 2820 
TCATGAGTTG AATTCTCCTA TTATGGATGC TAGCTTCTGG CCATCTCTGG CTCTCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTGCTTCC ACG ACTTTTA TCTTTTCTCC AACACATCGC 2940 
TTACCAATCC TCTCTCTGCT CTGTTGCTTT GGACTTCCCC ACAAGAATTT CAACGACTCT 3000 
CAAGTCTTTT CTTCCATCCC CACCACTAAC CTG AATGCCT AGACCCTTAT TTTTATTAAT 3060 
TTCCAATAGA TGCTGCCTAT GGGCTATATT GCTTTAG ATG AACATTAGAT ATTTAAAGCT 3 1 20 
CA AG AGGTTC AAAATCCA AC TCATTATCTT CTCTTTCTTT CACCTCCCTG CTCCTCTCCC 3 1 80 
TATATTACTG ATTGCACTG A ACAGCATGGT CCCCAATGTA GCCATGCAAA TGAGAAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAGA CTGCTGAAGC CAGAAGGATG ACTGATTACG 3300 
CCTCATGGGT GGAGGGG A CC ACTCCTGGGC CTTCGTGATT GTCAGGAGCA AGACCTGAGA 3360 
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCTTTCTA ATGAAGATCC ATAGAATTTG 3420 
CTACATTTGA GAATTCCAAT TAGGAACTCA CATGTTTTAT CTGCCCTATC AATTTTTTAA 3480 
ACTTGCTGAA AATTAAGTTT TTTCAAAATC TGTCCTTGTA AATTACTTTT TCTTACAGTG 3540 
TCTTGGCATA CTATATCAAC TTTGATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600 
AAAGTGGCTT TTATTCTCTT TATTATTATT ATTTTCTTTT ACTACTATAT TACGTTGTTA 3660 
TTATTTTGTT CTCTATACTA TCAATTTATT TG ATTTAGTT TCAATTTATT TTTATTGCTG 3720 
ACTTTTAAAA TAAGTGATTC GGGGGGTGGG AGAACAGGGG AGGGAGAGCA TTAGGACAAA 3780 
TACCTAATGC ATGTGGG ACT TAAAACCTAG ATG ATGGGTT G ATAGGTGCA GCAAACCACT 3840 
ATGGCACACG TATACCTGTG TAACAAACCT ACACATTCTG CACATGTATC CCAGAACGTA 3900 
AAGTAAAATT TAA AAAAAAG TGA 

PEZ8 Protein sequence: 
Protein Accession #: none 

SEQ ID NO:164 PEZ6 DNA SEQUENCE 

Nucleic Acid Accession #: AB028945 
Coding sequence: 1-3765 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ATGATGATGA ACGTCCCCGG CGGAGGAGCG GCCGCGGTGA TGATGACGGG CTACAATAAT 60 
GGTCGCTGTC CCCGGAATTC TCTCTACAGT GACTGCATTA TTGAGGAGAA GACGGTGGTC 120 
CTGCAGAAAA AAGACAATGA GGGCTTTGGA TTCGTGCTTC GAGGGGCCAA AGCTGACACA 180 
CCCATTGAAG AATTCACACC AACACCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTG 240 
GATGAAGGTG GGGTGGCGTG GCAAGCCGGA CTAAGGACCG GGG ACTIO 1 GATTGAGGTT 300 
AACAATGAGA ATGTTGTCAA AGTCGGCCAC AGGCAGGTGG TGAACATGAT CCGGCAGGGA 360 
GGGAATCACC TGGTCCTTAA GGTGGTCACG GTGACCAGGA ATCTGGACCC CGACGACACC 420 
GCCAGGAAGA AAGCTCCCCC GCCTCCAAAG CGGGCACCGA CCACAGCCCT CACCCTGCGC 480 
TCCAAGTCCA TGACCTCGGA GCTGGAGGAG CTCGTGGATA AAGATAAACC CGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC CCGCGCTGCT GAGAACATGG CTGTGGAACC GAGGGTGGCG 600 
ACCATCAAGC AGCGGCCCAG CAGCCGGTGC TTCCCGGCGG GCTCAGACAT GAACTCTGTG 660 
TACGAACGCC AAGGAATCGC CGTGATGACG CCCACTGTTC CTGGGAGCCC AAAAGCCCCG 720 
TTTCTGGGCA TCCCTCGAGG TACGATGCGA AGGCAGAAAT CAATAGACAG CAGAATCTTT 780 
CTATCAGGAA TAACAGAGGA AGAGCGGCAG TTTCTGGCTC CTCCAATGCT GAAGTTCACC 840 
AGAAGCCTGT CCATGCCGGA CACCTOGAG GACATCCCCC CTCCACCGCA GTCTGTGCCC 900 
CCGTCCCCAC CACCACCTTC CCCAACCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960 
GTCTACGGGA CGATTAAGCC TGCGTTCAAT CAGAATTCTG CCGCCAAGGT GTCCCCCGCC 1020 
ACCAGGTCCG ACACCGTGGC CACCATGATG AGGGAGAAGG GGATGTACTT CAGGAGAGAG 1080 
CTGGACCGCT ACTCCTTGGA CTCTGAAGAC CTCTACAGTC GGAATGCCGG CCCGCAAGCC 1140 
AACTTCCGCA ACAAGAGAGG CCAGATGCCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG CCGTCTACGT CCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1260 
CAGTCCAACG TGGAGGACAG CCCCGAGAAG ACGTGCTCCA TCCCTATCCC GACCATCATC 1320 
GTGAAGGAGC CGTCCACCAG CAGCAGCGGC AAGAGCAGCC AGGGCAGCAG CATGGAGATC 1380 
GACCCCCAGG CCCCGGAGCC ACCGAGCCAG CTGCGGCCTG ACGAAAGCCT GACCGTCAGC 1440 
AGCCCCTTTG CCGCCGCCAT CGCCGGAGCC GTCCGCGACC GTGAGAAGCG GCTGGAAGCC 1500 
AGGAGGAACT CCCCGGCCTT CCTCTCCACA GACCTGGGGG ATGAGGATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GCCCTCCATG TTCCCCGAGG AGGGGGATTT TGCTGACGAG 1620 
GACAGCGCTG AGCAGCTGTC ATCCCCCATG CCGAGTGCCA CGCCCAGGGA GCCCGAAAAC 1680 
CATTTCGTGG GTGGCGCCGA GGCCAGTGCT CCGGGTGAGG CTGGGAGGCC GCTGAATTCC 1740 
ACGTCCAAAG CCCAGGGGCC CGAGAGCAGC CCAGCAGTGC CCTCCGCGAG CAGCGGCACA 1800 
GCCGGCCCCG GGAATTATGT CCACCCACTC ACAGGGCGGC TGCTTGATCC CAGCTCCCCG 1860 
CTGGCCCTGG CACTCTCCGC AAGGGACCGA GCCATGAAGG AGTCTCAACA GGGACCCAAA 1920 
GGGGAGGCCC CCAAGGCCGA CCTCAACAAA CCTCTTTACA TTGATACCAA AATGCGGCCC 1980 
AGCCTGGATG CCGGCTTCCC TACGGTCACC AGGCAGAACA CCCGGGGACC CCTGAGGCGG 2040 
CAGGAGACGG AGAACAAGTA CGAGACCGAC CTGGGCCGAG ACCGGAAAGG CGATGACAAG 2100 
AAGAACATGC TGATCGACAT CATGGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATG 2160 
GTGCACACCG TGGACGCCAC TAAGCTGGAC AACGCCCTGC AGGAAGAGGA CGAGAAGGCA 2220 
GAGGTGGAGA TGAAGCCAGA CAGCTCGCCG TCCGAGGTGC CAGAAGGTGT TTCCGAAACC 2280 
GAAGGTGCTT TACAGATCTC CGCTGCCCCC GAGCCCACCA CCGTGCCCGG CAGAACCATC 2340 
GTCGCGGTGG GCTCCATGGA AGAGGCGGTG ATTTTGCCAT TCCGCATCCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT ATTTTTACAG AGCCATTGCC TCCTCCCCTG 2460 
GAATTTGCAA ATAGTTTTGA TATCCCCGAT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 
GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580 
CCAACCAACT CTGCAGACAG CAAGAAGCCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TTCCTGCCAC CCCCTGAAAG CTTTGACGCC GTCGCCGACT CTGGGATCGA GGAGGTGGAC 2700 
AGCCGGAGTA GCAGCGACCA CCACCTCGAG ACGACCAGCA CTATCTCCAC CGTGTCTAGC 2760 
A7XTCCACCC TGTCTTCCGA AGGTGGAGAG AATGTGGACA CCTGCACAGT CTATGCAGAT 2820 
GGGCAAGCAT TTATGGTTGA CAAACCCCCA GTACCTCCTA AGCCAAAAAT GAAGCCCATC 2880 
ATTCACAAAA GCAATGCACT TTATCAAGAC GCGCTCGTGG AAGAAGATGT AGATAGCTTT 2940 
GTTATCCCCC CCCCCGCTCC CCCGCCCCCG CCGGGCAGTG CCCAGCCTGG GATGGCCAAG 3000 
GTTCTCCAGC CAAGGACCTC CAAGTTGTGG GGCGACGTCA CAGAGATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 
CGAGAGAAAT TGGCAAAGCC GGGGGAAGGA CTGGATTCAC CAATGGGAGC CAAGTCCGCC 3180 
AGCCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCCC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATGAAA GCAGGACCTC AGGAACAAGA CGTGCCCCAA GCCCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAGACCCT GCCCGCCCCC CTGTCTGCTG CCACCGCCTC TCCTTCTCCC 3420 
GCTCTCTCAG ATGTCTTTAG CCTTCCAAGC CAGCCCCCTT CTGGGGATCT ATTTGGCTTG 3480 
AACCCAGCGG GACGCAGTAG GTCGCCATCC CCCTCGATAC TGCAACAGCC AATCTCAAAT 3540 
AAGCCTTTTA CAACTAAACC TGTCCACCTG TGGACTAAAC CAGATGTGGC CGATTGGCTG 3600 
GAAAGTCTAA ACTTGGGTGA ACATAAAGAG GCCTTCATGG ACAATGAGAT CGATGGCAGT 3660 
CACTTACCAA ACCTGCAGAA GGAGGACCTC ATCGATCTTG GGGTAACTCG AGTCGGGCAC 3720 
AGAATGAACA TAGAAAGGGC TTTGAAACAG CTGCTGGACA GATAAGGACG GCTGCTCTCC 3780 
ACCTCGCAG A CTGCTCTTGT TATAAGTAGA GATGGGCTCG TGCTG AAACA TCTG AATGCC 3840 
AAGCG AAGTC TGTGAGCATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCAAAG AAA 3900 
TACTGAGTTG TGTCCACAAC ATGGCTGGGT CTTCAGACCC CTGGCTCACC ATGTGGGTGT 3960 
CTTGGGCAGT TTCTATCACA CATGGGACAA GGGGAGGGAG TTTTTCTAAC ATGGAAAAAG 4020 
ATTCCCAGCC TGCCGCCC AG CATGCAGGTG GCCTCGCTTT GCCGGGTCCG AGAGGCTCCC 4080 
CGTCAATTTT GCACGGGATC CTAGCTCTTG TAGGCAGACA CCAGTGCACT CTAG ATACCT 4140 
CCTGAGACCT CCGTCCTCTG CTTTCCGGGC AGCTCTCACC ACCCCAGGCC CCGGCATGAG 4200 
GCCrrrCCTC AGTCCTGTGG CCTCTCAGAG G ACACCTGAT GCTCACCTGC CCCTCnTCT 4260 
CCTGCACTTG GCTTGCAGTG AGATGCTCCC AGATGCATTT GTCC AGTGCC CCATCATGGG 4320 
CCTGAAAGGC AGAGA AACTT TTTCCTACAC AGATTCTTTT CCCCATCTCC TCCTGTGGTT 4380 
TGCATCCATG GCTCTTTGGC CATGAGGTTC CTGGCAGTGC TGGGAGTTTG GATGGG ATCG 4440 
TGCCCAGCTT TGCTTAGCTT TCTTTATTTC TGC AAA TCTG TTAGCATAAT TCCAAGGTGG 4500 
CCAAGCAGAT GTCACATGGA GTTAGTCAAA GCACA AAGTC ACGATTCCAC AATGGAGGGG 4560 
AGACCTGGCC AAGGGAGCCA GCCAGCGTGC AACTGCCCAA GCTCCAGGTC TCCAGGACAA 4620 
GAGCAGTTGT CTGCCATG AG CACCCATCC A GGATGG AG AA TAAGGGCTTC TCTGCCTCTC 4680 
AGAATTCTTT TTAATTG AAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740 
ACTTTGACAT G AAGGACAAG AAGACGCATG GCTCATGGCG GGCACATGCG GCTGCCAGTG 4800 
AGACAGCGTC TCCTCTGGG A GCTGGGCGGG CACAGCATCC TCAGTTCTGT GCCCAGCCAA 4860 
GGGTG AGCAT CTCTGCTGAG ACAGTCCTTT TGCTCTCGGA GGCCAGGGAA GATGGTACTT 4920 
AGAGGCTTTT CCCCTATCGC TCTGGGTGTC TAGGAATCCC ACCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGGACC CAGTGGGTAT GGAGTATAGA CAGAACCCAG GGTTG AGAAC 5040 
AGAAGGTGGG CGGCAGGATC AGAGTGAAAG CAGAGGCGTG AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGGC TGCCAGGTCA GCCTCTCTGG CAAGGCTTTC TTGAGCCCCG CCCCTTTCTT 5160 
TCCCCGGAGT CCCTCCACCC CATAACAATA CCTCGAATTT CCAAAAGAGG TCACCAGATG 5220 
CACATGGGCC GCA AA ACACA CAGTC AGGCT TCCAGCACAT TCTCCCCCAT TTGGAGGATA 5280 
CTCGAATGTC AGGTTTTTGG TTTTATTATT ATTTCAGAAC TAGCTCAGCC CATCTCTAAT 5340 
TATAAAACAT GGTTTTGTTT 1 ITS 1 1 1 1 1C CTTTTTTTCT TGATTAGGTC TGGA ACAGCT 5400 
CTAGAATGAA CACATA AAAT TTAGCAATTT AA AATCTTTC TTTACTGCAA GTTTAAATAG 5460 
TTGTACAGAT AGTTTATAAG CACAATATTT TAAG AAAAAA AAGTGGCTGG TCTACTAGGC 5520 
AGCCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAGAAAAA AAAACTTTTG TGATTTAATA 5580 
ATACTATTTC TGTGG AATAA TTATA AAAGT ATGACCTTTT TAAATCAACC TTATTTGGAT 5640 
GCATCTGAAC CAGCAGAGCT GTGTTATATT TTCTATCTTT GCTAGAACTT CGTCATTGAA 5700 
GGACAATTTC TTCAAAGTGG TTACAATTCA TAATGCAGCA GTTTCTCCAA AAACAAAAAC 5760 
AAAACACACA CCACACACAC GCGCTTTTCC AGTCACACAC CCCTG ATGTT GGAACCAAGT 5820 
TTTTGGACCT TCTGTTCCAA AACCTTTTGC AGGTCAATCT TTGTATTTG A AATG ATCCAA 5880 
TCCAACTTG A AGTCAATTG A ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940 
ATCATG AGAT G AATGAGCAT TACTCTAG AT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000 
TGCTGAGGCT GCCCCATATT TTAG AAA ATG TAAAAATGGT GGTTTGGCCA TTAATTTGTC 6060 
TTCCATTTGA TGATACCGCA AAATTCCGTG AGTCCATTCC TTTGGCATGG CACTTTCCCT 6120 
GGGCCTACAG TTGGTATTAC CTCTGTGCTC AGTGCCAGGC AAAACACTAG CTCAAAGGAG 6180 
AGTCAAGGAA ACCGCTGGCA GACGATAACC AGTCG AAACT CGTG ACTTCG GTTTGTTGAA 6240 
CTTTGGCAGC CAGTTGGTGA GGGCCAGATG TT ATTCCCTT TCTTAAAG AT ACTCCAAGCC 6300 
ACATGCCACT AACCACAAGC AAGCTGGCTG CAAGACTAAA GAGCTGATAA CATAGTTTAT 6360 
TTTTACACTG TCTTATTATA G AG AAGTAAT AGACCTATCA GAACCTGCAC TGACCAACAA 6420 
ATAAACACAT GTTGCCAAG A TGAATCGGTC TCTATCTCTA TCTG CTT ATT TTGGTACTGA 6480 
AAGCAATAGT TCCTCATTCA AATC ACC ACC CACTGTTCTC CCCCTTTGGG ACATGTTAGG 6540 
ACGAGGCCCT ATTCCATGCC CCTCTTTAAT GGTGGAACAA ATGTTAAACT GCTCATCTAA 6600 
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TGTTACATAC TGTAATTTAA 6660 
ATAAACTGCA TTTACATGCC TAGTTTCTGT AATATTGTGT ATACAAAACC CAAAT CTCTC 6720 
AAAATGTAAA TTATGTATAC CTGCCAAGAT ACCTTTTCCA GGGTGTCTGC GCACATTTTA 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TGATTTGCTG AACTTTACAT 6840 
ATCACAAAGT GAATTATTTG TG ATACTTTA GTTAATAAAA TGGTAAATTT TTTTCTC AGT 6900 
TATTGAACAA GCAAGCATTA TCCAGTTGAT CTGGCAATGA CTTTTTGTGT GTGGGCCACA 6960 
ATATTG ATTT TCCCATTAAC AATTTTTTTT TGTTTTTTAA ATACTAATAT GTTTCACACT 7020 
ATAGTTTGTG TAACAACACG TGTTCGCATT ATCTATGTTG CTGTTACTTT TGTGCTTTTA 7080 
TTCTTTTTAG ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTT TCTCCCAATC 7140 
CTTAAATCTC TTGTATGGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGC ACTA 7200 
AGGTTGTGGT TCTGATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260 
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320 
CTGACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTCCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTCCATT TCCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAGAATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTG ACCG CTGCTATAGG CGTGG AGCTG AGGCTCGGCT 7500 
TTTCCTTTTG TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAGA TCCGGACCCT 7560 
TCCCTTAGGG TCCAGTCTCC CCACACCCCA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620 
GAGTGGCAG A ACTGGGCCGC CTCTCTGGTT GACAAGCAAA CCACATGCTA AGGCTTGG AG 7680 
CAAGAGAGAA TTTGTGTCTA TTGGCAAAGA ACTAAGCCAG GAAGACATGG GCCATCCCTC 7740 
CGCTTTAGGG AAGCATATTT TAAACCTAAA CGTTGAACTT CTTCTTTGGC CTC ACCAGTG 7800 
AAAACTTGTT GTCTTTAGTT CCTAAAGTTT CTTCTACTTT GGCACATTCC CCAGTTG AGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGGAAA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920 
CCCTCAAGCT CTCCCGCTTC ACCATCCAAT AGTTTCTCCC AAACCTTGGC ACCCCCCTAG 7980 
ACTTTGCTTC CAATGGTTTC TTCCAGACCA CTTTTCCTAG ATGAATATAT TCCTT TACCT 8040 
TACTAGGAAA ATTATTGGAA GATTTTTTCT TTTACTTGAA ATTGGAGGCA TTTTAATAAC 8100 
TCGCGAACTG GAATGTGTTT CTGTATTTGT AG ACAACCAT GTACCCATGC AAGTAGGTGA 8 1 60 
ACATTCCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CCCGTGCTTT 8220 
GTCGCGAATC AGAGAATTTC CAAACTTGTT TCTCAGACTT CCGCAGATCT CATCACTTTG 8280 
ATTTCTAATC CATGCTCTAT TGGTGATTTT GTTTATCGTT CCTGTAACTT GTTCTAC ATT 8340 
CCACAGTCTT TACCGTTTTA TGTTC AAAAT TACAACAATC CCTGTCCATT GATTCCACTC 8400 
TGGAACTCTT TGTTCATGCC AATTTTG AAA TTTTAATACG AGCCTTCAAA TAAACACAGA 8460 
AAAGAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO:165 PEZ6 Protein sequence: 
Protein Accession #: BAA82974.1 

1 11 21 31 41 51 
| | | | | | 

MMMNVPGGGA AAVMMTGYNN GRCPRNSLYS DCIIEEKTVV LQKKDNEGFG FVLRGAKADT 60 
PIEEFTPTPA FPALQYLESV DEGGVAWQAG LRTGDFLIEV NNENVVKVGH RQVVNMIRQG 120 
GNHLVLKVVT VTRNLDPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEE LVDKDKPEEI 180 
VPASKPSRAA ENMAVEPRVA TIKQRPSSRC FPAGSDMNSV YERQGIAVMT PTVPGSPKAP 240 
FLGIPRGTMR RQKSIDSRIF LSGITEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQSVP 300 
PSPPPPSPTT YNCPKSPTPR VYGTIKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRE 360 
LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGKI ASKAVYVPAK PARRKGMLVK 420 
QSNVEDSPEK TCSIPIPTII VKEPSTSSSG KSSQGSSMEI DPQAPEPPSQ LRPDESLTVS 480 
SPFAAAIAGA VRDREKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADE 540 
DSAEQLSSPM PSATPREPEN HFVGGAEASA PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600 
AGPGNYVHPL TGRLLDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK PLYIDTKMRP 660 
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMLIDIMDT SQQKSAGLLM 720 
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQISAAP EPTTVPGRTI 780 
VAVGSMEEAV ILPFRIPPPP LASVDLDEDF IFTEPLPPPL EFANSFDIPD DRAASVPALS 840 
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESFDA VADSGIEEVD 900 
SRSSSDHHLE TTSTISTVSS ISTLSSEGGE NVDTCTVYAD GQAFMVDKPP VPPKPKMKPI 960 
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGSAQPGMAK VLQPRTSKLW GDVTEIKSPI 1020 
LSGPKANVIS ELNSILQQMN REKLAKPGEG LDSPMGAKSA SLAPRSPEIM STISGTRSTT 1080 
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPVVSPT EMNKETLPAP LSAATASPSP 1140 
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPDVADWL 1200 
ESLNLGEHKE AFMDNEIDGS HLPNLQKEDL IDLGVTRVGH RMNIERALKQ LLDR 



SEQ ID NO:166 PEZ4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000024 

Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ACTGCGAAGC GGCTTCTTCA GAGCACGGGC TGGAACTGGC AGGCACCGCG AGCCCCTAGC 60 
ACCCGACAAG CTGAGTGTGC AGGACGAGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120 
GAGGCTTCCA GGCGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTCCG CCCGCTGAGG 180 
CGCCCCCAGC CAGTGCGCTT ACCTGCCAGA CTGCGCGCCA TGGGGCAACC CGGGAACGGC 240
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGCGCCGG ACCACGACGT CACGCAGCAA 300
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATCGTCATGT CTCTCATCGT CCTGGCCATC 360 
GTGTTTGGCA ATGTGCTGGT CATCACAGCC ATTGCCAAGT TCGAGCGTCT GCAGACGGTC 420
ACCAACTACT TCATCACTTC ACTGGCCTGT GCTGATCTGG TCATGGGCCT GGCAGTGGTG 480 
CCCTTTGGGG CCGCCCATAT TCTTATGAAA ATGTGGACTT TTGGCAACTT CTGGTGCGAG 540
TTTTGGACTT CCATTGATGT GCTGTGCGTC ACGGCCAGCA TTGAGACCCT GTGCGTGATC 600 
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAGAGCCT GCTGACCAAG 660 
AATAAGGCCC GGGTGATCAT TCTGATGGTG TGGATTGTGT CAGGCCTTAC CTCCTTCTTG 720 
CCCATTCAGA TGCACTGGTA CCGGGCCACC CACCAGGAAG CCATCAACTG CTATGCCAAT 780 
GAGACCTGCT GTGACTTCTT CACGAACCAA GCCTATGCCA TTGCCTCTTC CATCGTGTCC 840
TTCTACGTTC CCCTGGTGAT CATGGTCTTC GTCTACTCCA GGGTCTTTCA GGAGGCCAAA 900 
AGGCAGCTCC AGAAGATTGA CAAATCTGAG GGCCGCTTCC ATGTCCAGAA CCTTAGCCAG 960 
GTGGAGCAGG ATGGGCGGAC GGGGCATGGA CTCCGCAGAT CTTCCAAGTT CTGCTTGAAG 1020 
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCACCCT CTGCTGGCTG 1080 
CCCTTCTTCA TCGTTAACAT TGTGCATGTG ATCCAGGATA ACCTCATCCG TAAGGAAGTT 1140
TACATCCTCC TAAATTGGAT AGGCTATGTC AATTCTGGTT TCAATCCCCT TATCTACTGC 1200 
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCTTCTTTG 1260 
AAGGCCTATG GGAATGGCTA CTCCAGCAAC GGCAACACAG GGGAGCAGAG TGGATATCAC 1320 
GTGGAACAGG AGAAAGAAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380 
GTGGGCCATC AAGGTACTGT GCCTAGCGAT AACATTGATT CACAAGGGAG GAATTGTAGT 1440 
ACAAATGACT CACTGCTGTA AAGCAGTTTT TCTACTTTTA AAGACCCCCC CCCCCCCAAC 1500
AGAACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560
TGTATAGAGA TATGCAGAAG GAAGGGCATC CTTCTGCCTT TTTTATTTTT TTAAGCTGTA 1620 
AAAAGAGAGA AAACTTATTT GAGTGATTAT TTGTTATTTG TACAGTTCAG TTCCTCTTTG 1680 
CATGGAATTT GTAAGTTTAT GTCTAAAGAG CTTTAGTCCT AGAGGACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTACCTC ACTATTCAAG TATTAGGGGT AATATATTGC 1800 
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTCCTACA CCCTTGGACT TGAGGATTTT 1860
GAGTATCTCG GACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920 
ACACGGGGTA TTTTAGGCAG GGATTTGAGG AGCAGCTTCA GTTGTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGTAAATA AAATGTTTGA CCATG 
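
The "Coding sequence" annotation of each DNA entry gives 1-based, inclusive positions that include the stop codon (for SEQ ID NO:166, positions 220-1461 span 1242 bases, that is 413 codons plus the stop). The following is a minimal sketch, not part of the listing itself, of how such a range could be pulled out of the printed rows and sanity-checked. The function names are hypothetical, the input is assumed to contain only the sequence rows (no header or ruler lines), and the toy sequence is invented for illustration.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean_listing(rows):
    """Remove blanks and position counters from printed sequence rows.

    Everything that is not a letter is dropped, so the 10-base spacing and
    the trailing counters (60, 120, ...) disappear.
    """
    return re.sub(r"[^A-Za-z]", "", rows).upper()


def extract_cds(rows, start, end):
    """Return the coding region for 1-based, inclusive coordinates."""
    seq = clean_listing(rows)
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def looks_like_cds(cds):
    """Check for whole codons, an ATG start and a terminal stop codon."""
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


# Toy row in the layout of the listing (invented data, not a real entry).
toy_rows = "CCATGGCTTA ATT 13"
cds = extract_cds(toy_rows, 3, 11)
print(cds, looks_like_cds(cds))
```

A check of this kind is mainly useful for spotting rows where a block has lost or gained a character, since any such error shifts the start and stop codons away from the annotated positions.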



SEQ ID NO:167 PEZ4 Protein sequence:
Protein Accession #: NP_000015.1



1 11 21 31 41 51 
| | | | | |

MGQPGNGSAF LLAPNRSHAP DHDVTQQRDE VWVVGMGIVM SLIVLAIVFG NVLVITAIAK 60
FERLQTVTNY FITSLACADL VMGLAVVPFG AAHILMKMWT FGNFWCEFWT SIDVLCVTAS 120
IETLCVIAVD RYFAITSPFK YQSLLTKNKA RVIILMVWIV SGLTSFLPIQ MHWYRATHQE 180
AINCYANETC CDFFTNQAYA IASSIVSFYV PLVIMVFVYS RVFQEAKRQL QKIDKSEGRF 240
HVQNLSQVEQ DGRTGHGLRR SSKFCLKEHK ALKTLGIIMG TFTLCWLPFF IVNIVHVIQD 300
NLIRKEVYIL LNWIGYVNSG FNPLIYCRSP DFRIAFQELL CLRRSSLKAY GNGYSSNGNT 360
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDNID SQGRNCSTND SLL



SEQ ID NO:168 PEZ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_004457

Coding sequence: 143-2305 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
| | | | | |

GAATTCGTTG TTGGGAAGGA CTGGGGAAAC AGCTGTAACA TTTGCCACCC TCAGAAGCTG 60
CTGGTCCTGT GTCACACCAC CTTAGCCTCT TGATCGAGGA AGATTCTCGC TGAAGTCTGT 120
TAATTCTACT TTTTGAGTAC TTATGAATAA CCACGTGTCT TCAAAACCAT CTACCATGAA 180
GCTAAAACAT ACCATCAACC CTATTCTTTT ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTATTTTA ACATACATTC CGTTTTATTT TTTCTCCGAG TCAAGACAAG AAAAATCAAA 300 
CCGAATTAAA GCAAAGCCTG TAAATTCAAA ACCTGATTCT GCATACAGAT CTGTTAATAG 360
TTTGGATGGT TTGGCTTCAG TATTATACCC TGGATGTGAT ACTTTAGATA AAGTTTTTAC 420
ATATGCAAAA AACAAATTTA AGAACAAAAG ACTCTTGGGA ACACGTGAAG TTTTAAATGA 480
GGAAGATGAA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATTCTTG GACAGTATAA 540
TTGGCTTTCC TATGAAGATG TCTTTGTTCG AGCCTTTAAT TTTGGAAATG GATTACAGAT 600
GTTGGGTCAG AAACCAAAGA CCAACATCGC CATCTTCTGT GAGACCAGGG CCGAGTGGAT 660
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720
AGGAGGTCCA GCCATTGTTC ATGCATTAAA TGAAACAGAG GTGACCAACA TCATTACTAG 780
TAAAGAACTC TTACAAACAA AGTTGAAGGA TATAGTTTCT TTGGTCCCAC GCCTGCGGCA 840
CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900 
GCATACCATG GCTGCACTGG AGGCCCTGGG AGCCAAGGCC AGCATGGAAA ACCAACCTCA 960 
TAGCAAACCA TTGCCCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020
TCCAAAGGGA GTCATGATCT CACATAGTAA CATTATTGCT GGTATAACTG GGATGGCAGA 1080 
AAGGATTCCA GAACTAGGAG AGGAAGATGT CTACATTGGA TATTTGCCTC TGGCCCATGT 1140
TCTAGAATTA AGTGCTGAGC TTGTCTGTCT TTCTCACGGA TGCCGCATTG GTTACTCTTC 1200 
ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGGATACATC 1260
CATGTTGAAA CCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320 
TGTCATGAAT AAAGTCAGTG AAATGAGTAG TTTTCAACGT AATCTGTTTA TTCTGGCCTA 1380
TAATTACAAA ATGGAACAGA TTTCAAAAGG ACGTAATACT CCACTGTGCG ACAGCTTTGT 1440
TTTCCGGAAA GTTCGAAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGCGC 1500
TCCACTTTCT GCAACCACGC AGCGATTCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560 
GGGATACGGG CTCACTGAAT CTGCTGGGGC TGGAACAATT TCCGAAGTGT GGGACTACAA 1620
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680 
AGGTGGATAC TTTAATACTG ATAAGCCACA CCCCAGGGGT GAAATTCTTA TTGGGGGCCA 1740
AAGTGTGACA ATGGGGTACT ACAAAAATGA AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800 
TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTGAAC CCGATGGATG 1860
CTTAAAGATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920
TCTTGGGAAA GTAGAGGCAG CTTTGAAGAA TCTTCCACTA GTAGATAACA TTTGTGCATA 1980 
TGCAAACAGT TATCATTCTT ATGTCATTGG ATTTGTTGTG CCAAATCAAA AGGAACTAAC 2040 
TGAACTAGCT CGAAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTGA 2100
AATGGAAAAT GAGGTACTTA AAGTGCTTTC CGAAGCTGCT ATTTCAGCAA GTCTGGAAAA 2160
GTTTGAAATT CCAGTAAAAA TTCGTTTGAG TCCTGAACCG TGGACCCCTG AAACTGGTCT 2220
GGTGACAGAT GCCTTCAAGC TGAAACGCAA AGAGCTTAAA ACACATTACC AGGCGGACAT 2280
TGAGCGAATG TATGGAAGAA AATAATTATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340 
TCACATCAAA TAGGAAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTCCATTC 2400 
CTCATATTAA ACTATTACTT CTCATGACGT CACCATTTTT AACTGACAGG ATTAGTAAAA 2460
CATTAAGACA GCAAACTTGT GTCTGTCTCT TCTTTCATTT TCCCCGCCAC CAACTTACTT 2520 
TACCACCTAT GACTGTACTT GTCAGTATGA GAATTTTTCT GAATCATATT GGGGAAGCAG 2580 
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTGTATAAT GTTCAGTTTG 2640 
TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700
TGGGGGCTTT TTTACTTACT GTATTTAAAA ATACAAGGGT ATTGATATGA AATTATGTAA 2760
ATTTCAAATG CTTATGAATC AAATCATTGT TGAACAAAAG ATTTGTTGCT GTGTAATTAT 2820
TGTCTTGTAT GCATTTGAGA GAAATAAATA TACCCATACT TATGTTTTAA GAAGTTGAGA 2880 
TCTTGTGAAT ATATGCCTGT CAGTGTCTTC TTTATATATT TATTTTTTAT TAGAAAAAAT 2940
GAAGTTTGGT TGGTGATGCA TGAAACAAAA TAGCAAGAGA GGGTTATAGT TTAATAGTAA 3000
GGGAGATAAC ACAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTGAAACTAG GAAAATACCC AAACTTAAAT 3120 
GGAAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180 
GAATATTTCG TTGACTATAT GTACATTGAG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240
GAAAATTATT CTATTCCAAA GTCTCCTTTT AGTCTAGATA ATCATTATTT CATTTTAAAA 3300
TTAGTGTTTT TCATAGTTTG CACTGATGCG TGTATGGATG TGTGTGAGTC AGTGGTAGCT 3360 
TATTTAAAAA GCACCTTATC CTTTCTCCCA TAACCTTTGT ACACTAAAAA ATGAAAGAAT 3420
TTAGAATGTA TTTGATGATA GCATTCTCAC TAAGACACAT GAGAATTTAA CTTTATAACC 3480
GCGTGAGTTA AGATTTAATT CATAGGTTTT GATGTCATTG TTGAAGTTAT TTGTAATTCA 3540
GAAACCTTGC TTGTGTGATA CATAGTAAGT CTCTTCATTT ATTACTGCTT GCCTGTTGTT 3600
ATATCTGCAT TATCAAAAGC AATAGTGCAC CAATTAAGAT GTGCTCAAAT CAGGACTTAA 3660
ATCATAGGCA CCACATTTTT CATGTCAGAC TAGTTACTTT GTTGATTCTC AGTTACTGTA 3720 
GGCATCAAAA GGCAAAAATC A 

SEQ ID NO:169 PEZ1 Protein sequence:

Protein Accession #: NP_004448.1

1 11 21 31 41 51
| | | | | |

MNNHVSSKPS TMKLKHTINP ILLYFIHFLI SLYTILTYIP FYFFSESRQE KSNRIKAKPV 60
NSKPDSAYRS VNSLDGLASV LYPGCDTLDK VFTYAKNKFK NKRLLGTREV LNEEDEVQPN 120
GKIFKKVILG QYNWLSYEDV FVRAFNFGNG LQMLGQKPKT NIAIFCETRA EWMIAAQACF 180
MYNFQLVTLY ATLGGPAIVH ALNETEVTNI ITSKELLQTK LKDIVSLVPR LRHIITVDGK 240
PPTWSDFPKG IIVHTMAAVE ALGAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKGVMIS 300
HSNIIAGITG MAERIPELGE EDVYIGYLPL AHVLELSAEL VCLSHGCRIG YSSPQTLADQ 360
SSKIKKGSKG DTSMLKPTLM AAVPEIMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQI 420
SKGRNTPLCD SFVFRKVRSL LGGNIRLLLC GGAPLSATTQ RFMNICFCCP VGQGYGLTES 480
AGAGTISEVW DYNTGRVGAP LVCCEIKLKN WEEGGYFNTD KPHPRGEILI GGQSVTMGYY 540
KNEAKTKADF SEDENGQRWL CTGDIGEFEP DGCLKIIDRK KDLVKLQAGE YVSLGKVEAA 600
LKNLPLVDNI CAYANSYHSY VIGFVVPNQK ELTELARKKG LKGTWEELCN SCEMENEVLK 660
VLSEAAISAS LEKFEIPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADIERMYGRK

SEQ ID NO:170 PCQ7 DNA SEQUENCE
Nucleic Acid Accession #: none found

Coding sequence: 38-1075 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AGCAACGACG CCGGGCAGCG GGAGCGGCGG CCGCGCCATG TGGCTGCTGG GGCCGCTGTG 60 

CCTGCTGCTG AGCAGCGCCG CGGAGAGCCA GCTGCTCCCC GGGAACAACT TCACCAATGA 120 

GTGCAACATA CCAGGCAACT TCATGTGCAG CAATGGACGG TGCATCCCGG GCGCCTGGCA 180 

GTGTGACGGG CTGCCTGACT GCTTCGACAA GAGTGATGAG AAGGAGTGCC CCAAGGCTAA 240 

GTCGAAATGT GGCCCAACCT TCTTCCCCTG TGCCAGCGGC ATCCATTGCA TCATTGGTCG 300

CTTCCGGTGC AATGGGTTTG AGGACTGTCC CGATGGCAGC GATGAAGAGA ACTGCACAGC 360 

AAACCCTCTG CTTTGCTCCA CCGCCCGCTA CCACTGCAAG AACGGCCTCT GTATTGACAA 420 

GAGCTTCATC TGCGATGGAC AGAATAACTG TCAAGACAAC AGTGATGAGG AAAGCTGTGA 480 

AAGTTCTCAA GAACCCGGCA GTGGGCAGGT GTTTGTGACT TCAGAGAACC AACTTGTGTA 540 

TTACCCCAGC ATCACCTATG CCATCATCGG CAGCTCCGTC ATTTTTGTGC TGGTGGTGGC 600

CCTGCTGGCA CTGGTCTTGC ACCACCAGCG GAAGCGGAAC AACCTCATGA CGCTGCCCGT 660 

GCACCGGCTG CAGCACCCTG TGCTGCTGTC CCGCCTGGTG GTCCTGGACC ACCCCCACCA 720 

CTGCAACGTC ACCTACAACG TCAATAATGG CATCCAGTAT GTGGCCAGCC AGGCGGAGCA 780 

GAATGCGTCG GAAGTAGGCT CCCCACCCTC CTACTCCGAG GCCTTGCTGG ACCAGAGGCC 840 

TGCGTGGTAT GACCTTCCTC CACCGCCCTA CTCTTCTGAC ACGGAATCTC TGAACCAAGC 900

CGACCTGCCC CCCTACCGCT CCCGGTCCGG GAGTGCCAAC AGTGCCAGCT CCCAGGCAGC 960

CAGCAGCCTC CTGAGCGTGG AAGACACCAG CCACAGCCCG GGGCAGCCTG GCCCCCAGGA 1020 

GGGCACTGCT GAGCCCAGGG ACTCTGAGCC CAGCCAGGGC ACTGAAGAAG TATAAGTCCC 1080 

AGTTATTCCA AAGTCCATAT GGGTTAATCT GCTCTGACTT GTTGCCAnC TAACAATTTG 1140

TGCTCATGGG AAGCTCTTTA AGCACCTGTA AGGATGTCTC AAGTTACAGT TTGGGATATT 1200

AACTATCTCT GCATTCCCCT CCTCCCCCAG ACTTCAGAGA TGTTTTTCTG GCGTCTCAGT 1260 

TGACATGATC TGTTGTGCGT CTTTTCTGTC AGGTCACTCT TCCCTTGGGA CCCGAGATCA 1320 

CACCCTCATT TTTCACATTA TTCTGTTTCT GTTGGAGAGA CAGCATATAA AACAGTATTG 1380 

AAATAGGCTG GGAGAGAGCA ATGTTTCTGT GCTATATTGG ATGCTCAGAA GTGCAGGAGA 1440 

CGCTGGACCC AATTCTCTCT GCTGGGTAGT TACCTTATAG CATTTGGGGA TTTGGGTTAG 1500

ATGATCTAAC CAGGAGGCCA TCACTGGATG GTCACCCCCC CAAAAAAATT CCATTTGAGC 1560 

ATCAAAACCT GCTTTGCACA ATCCTATTTG ATGCCCCCAG TTCAGCAGAG TCAGTGGCCA 1620 

AAGAAAACTT TGGACGTGAG TAACACCCTT CAGCAGTCGC AACGTTATTT TGGTTTTGTG 1680 

AAGGACTCTG AAACCATCTA CCCTGTATAA ATTCTGGCTT TAGAAATTTG CCCAAGAATG 1740

CTCATTCTGA GAGCTTTCCT CAGCAGCATA TATCATCAGC CTCATCCTAA AATAGGCAGG 1800

GAGCCCCTCC CATGAGTTTA TCCAAGTTCT CAGCTCCTAA AATGCAGGCT GCCAAGACCC 1860 

TACACCTGCC CTGGCTCTAC AGCCACTTAC CTCGTTTCTG GACTGTCACC CTCCCAGCTG 1920 

ACCTGCCCGT AGCCAAGGAA TGAGGACCTA ACTTGAGTTG GCCCAAAGTC TGACCTGGCT 1980 

GTATGTCCCT GTGGCCCACA CCCAGCCTGT CTTGCTCATT CATGCAGCCT CAACACTGGC 2040

CTCCAAAGTT CCCTTAACAC TTGCAAAGTC CTTTTTACCT GTGCATTTGG ACTTGAGGAC 2100

ACTGGTTTCT ATCACAGGTG AGAGCCATGT TCAATACCTC CAGCAAGCTC TCCTGGCTCC 2160 

CTGCACTGTG CACGCTCCTC TTCCCAAGGT CCCAATACCA GCACCTCTAG TTAGAGTTAG 2220 

GGTCAGGGTC AGGCCTCTCC CAACATCCCA GTAGTTTCTC CTCTGAGACA CATGGGCAAG 2280 

AGACAATTTG GAGTCAAGAT TTTCCATTTG GATCTATTTT AAATCTTTTA GAAATGCATT 2340 

TGAAACAGTG TGTTTGTTTT TTCCCTTCTA GTTAAGGGAC TATTTATATG TGTATAGGAA 2400

AGCTGTCTCT TTTTTTGTTT TTCCTTTAAC AAGGTCCAAA GAAAGATGCA AAAGGAGATC 2460 

ACACCCTTGC CCCGCTGAGC CCCGTGATAA CAAGTCACTC CAGACTAACC TGTGTGCCAG 2520 

ACATTTGTGC ATTGTTGCAC TTTGAGGTTA TTATTTATCA AGTTCTTGAA GGAAGCAGAA 2580 

AGAGGGACTC CTCTCTCCCT CCCTGTATAG TCTCTATGTT TGTGCTAGTT TTTCTTTTTT 2640 

TTCTCTGTGT CCAGTCAGCC ACAGGGCCCG CCTCCCTGCA GGAATAAGGG GTAAAACGTT 2700

AGGTGTTGTT TGGCAAGAAA CCACACTGAC TGATGAGGGG TAAAATGGAA CCAGGTAGAG 2760 

CCACTCCGGG CAGCTGTCAC CCATTCAGAA CTTCTTTCCG CAGCTGAAGA AATGTTCAGT 2820 

AACCTGTTTG ACGCTAATTA AAACAGAGCC TGCAGGAAGT GGGGCTAAAG TGGCATTCAG 2880 

TGATCCTGTT CTGTAGACTT TTCTTTCTTT TTTTAACCAA ATCCAAAGGA TGTTACAGAA 2940 




AAGCTAGCCA CTGGTATTTT GTTTTGTTTA AAAAAAAAAA GAAAGAAAGA AAGAAAGAAA 3000
AACGGAAAGG AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAGTTATTTT 3060
CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAGAAT GTGTTAGAAC TCTGGCTGAA 3120
CATTTCATCT CCTGTGAGTC AGAAGGGCTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180
TTTCTGGTGC TCTGGAAGTT GTTTAGAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3240
GAGTTAATCT CACTCGCTTT TCTGCTTCCA GGCATCTTAG GAAAAACAAA TGGTTTTAGT 3300
AGATAAGGGA TGCCTACTAA TGCTTTTTTA AAACAAACAG GGACATTTTT ATTATAGATT 3360
TGATTTTTTT AATGAATGTT TTTAAAAATA TATAAATAGG ACACCAAAGC GGCAGGGTTT 3420
TTTTTGGGGG GAGGGGGTTT GTTTTCCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480
TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA CGCCAACCAG 3540
AAAATAGTCT CATCTCTTTT TTTCTCAAAT GAGATCCGTG TTTTATTTTA GCATTAAATT 3600
AGTTACACTG TGATGACTGG CCTATTACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660
TTTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTG 3720
GAAAGGTTGT GTGTCGTTGC TTTTTGTGTT TTGGTTAGGC NNNNNNNNNN TTTTTAATTT 3780
TTATACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTGCAAAWG GWMCTAMARM 3840
AAMMAAAAAC AWYWTTGGGG GGGCTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900
TGGGGCGGCG GGGCCCACGT AGGTACGGCG ACCACGCGGG CCCAAACGGG ACCCCAGAAG 3960
GAAACCCTGG CCAAGAAAAA GGTGGCGAGA ATTCTCCACA CCAGAAAAAA ACGCGCCGGG 4020
GGAAACCGCA GAGTGTTGCG TAAACCACAC CCGAAGAGAG AACTCAGAAG CACACAAGCG 4080
GGACTCAACC AGGAGGACCC AAGGGAACCC GATAGAGTAC G



SEQ ID NO:171 PCQ7 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MWLLGPLCLL LSSAAESQLL PGNNFTNECN IPGNFMCSNG RCIPGAWQCD GLPDCFDKSD 60
EKECPKAKSK CGPTFFPCAS GIHCIIGRFR CNGFEDCPDG SDEENCTANP LLCSTARYHC 120
KNGLCIDKSF ICDGQNNCQD NSDEESCESS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180
VIFVLVVALL ALVLHHQRKR NNLMTLPVHR LQHPVLLSRL VVLDHPHHCN VTYNVNNGIQ 240
YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLNQADL PPYRSRSGSA 300
NSASSQAASS LLSVEDTSHS PGQPGPQEGT AEPRDSEPSQ GTEEV



SEQ ID NO:172 PEL3 DNA SEQUENCE
Nucleic Acid Accession #: NM_005656.1

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60
CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120
CGGAAAACCC CTATCCCGCA CAGCCCACTG TGGTCCCCAC TGTCTACGAG GTGCATCCGG 180
CTCAGTACTA CCCGTCCCCC GTGCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240
ACCCCGTCGT CTGCACGCAG CCCAAATCCC CATCCGGGAC AGTGTGCACC TCAAAGACTA 300
AGAAAGCACT GTGCATCACC TTGACCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360
CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420
CCTCAGGTAC CTGCATCAAC CCCTCTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480
GGGAGGACGA GAATCGGTGT GTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATGTACT 540
CATCTCAGAG GAAGTCCTGG CACCCTGTGT GCCAAGACGA CTGGAACGAG AACTACGGGC 600
GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660
ATGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCCGGCAAT GTCGATATCT 720
ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780
TAGCCTGCGG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840
CGCTCCCGGG GGCCTGGCCC TGGCAGGTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900
GAGGCTCCAT CATCACCCCC GAGTGGATCG TGACAGCCGC CCACTGCGTG GAAAAACCTC 960
TTAACAATCC ATGGCATTGG ACGGCATTTG CGGGGATTTT GAGACAATCT TTCATGTTCT 1020
ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACCA 1080
AGAACAATGA CATTGCGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140
AACCAGTGTG TCTGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCTGGATTT 1200
CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260
TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320
CAGCCATGAT CTGTGCCGGC TTCCTGCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380
GAGGGCCTCT GGTCACTTCG AACAACAATA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440
GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500
ACTGGATTTA TCGACAAATG AAGGCAAACG GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560
CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620
ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TGGCACTCTC 1680
TGCCATACTG TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740
CCGCAAGGGG TGATGGCCGG CTGGTTGTGG GCACTGGCGG TCAATTGTGG AAGGAAGAGG 1800
GTTGGAGGCT GCCCCCATTG AGATCTTCCT GCTGAGTCCT TTCCAGGGGC CAATTTTGGA 1860
TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGATGAAAAA GGAGAGACAT 1920
GGAAAGGGAG ACAGCCAGGT GGCACCTGCA GCGGCTGCCC TCTGGGGCCA CTTGGTAGTG 1980
TCCCCAGCCT ACTTCACAAG GGGATTTTGC TGATGGGTTC TTAGAGCCTT AGCAGCCCTG 2040
GATGGTGGCC AGAAATAAAG GGACCAGCCC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100
AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAT ATAGACAGTG CCCTTGGTGC 2160






GAGGGAAGCA ATTGAAAAGG AACTTGCCCT GAGCACTCCT GGTGCAGGTC TCCACCTGCA 2220 

CATTGGGTGG GGCTCCTGGG AGGGAGACTC AGCCTTCCTC CTCATCCTCC CTGACCCTGC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTCCCTGGC AGGGCGCCAA CTTTGGCACC 2340 

ATGTCGGCCT CTTCAGGCCT GATAGTCATT GGAAATTGAG GTCCATGGGG GAAATCAAGG 2400 

ATGCTCAGTT TAAGGTACAC TGTTTCCATG TTATGTTTCT ACACATTGAT GGTGGTGACC 2460 
CTGAGTTCAA AGCCATCTT 



SEQ ID NO:173 PEL3 Protein sequence:
Protein Accession #: NP_005647.1

1 11 21 31 41 51
| | | | | |

MALNSGSPPA IGPYYENHGY QPENPYPAQP TVVPTVYEVH PAQYYPSPVP QYAPRVLTQA 60
SNPVVCTQPK SPSGTVCTSK TKKALCITLT LGTFLVGAAL AAGLLWKFMG SKCSNSGIEC 120
DSSGTCINPS NWCDGVSHCP GGEDENRCVR LYGPNFILQM YSSQRKSWHP VCQDDWNENY 180
GRAACRDMGY KNNFYSSQGI VDDSGSTSFM KLNTSAGNVD IYKKLYHSDA CSSKAVVSLR 240
CLACGVNLNS SRQSRIVGGE SALPGAWPWQ VSLHVQNVHV CGGSIITPEW IVTAAHCVEK 300
PLNNPWHWTA FAGILRQSFM FYGAGYQVQK VISHPNYDSK TKNNDIALMK LQKPLTFNDL 360
VKPVCLPNPG MMLQPEQLCW ISGWGATEEK GKTSEVLNAA KVLLIETQRC NSRYVYDNLI 420
TPAMICAGFL QGNVDSCQGD SGGPLVTSNN NIWWLIGDTS WGSGCAKAYR PGVYGNVMVF 480
TDWIYRQMKA NG



SEQ ID NO:174 PBJ4 DNA SEQUENCE
Nucleic Acid Accession #: AI694767

Coding sequence: 130-1086 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CAGAGAGGCT GTATTTCAGT GCAGCCTGCC NNNNNNNNNN TGGAGGAAGA CTGGACAAAG 60
GGGGTCACAC ATTCCTTCCA TACGGTTGAG NNNNNNNNNN CCTGGTGCTG GTCACAGTTC 120
AGCTTCTTCA TGATGGTGGA TCCCAATGGC NNNNNNNNNN GTGCTACATA CTTCATCCTA 180
ATAGGCCTCC CTGGTTTAGA AGAGGCTCAG NNNNNNNNNN CCTTCCCATT GTGCTCCCTC 240
TACCTTATTG CTGTGCTAGG TAACTTGACA ATCATCTACA TTGTGCGGAC TGAGCACAGC 300
CTGCATGAGC CCATGTATAT ATTTCTTTGC ATGCTTTCAG GCATTGACAT CCTCATCTCC 360
ACCTCATCCA TGCCCAAAAT GCTGGCCATC TTCTGGTTCA ATTCCACTAC CATCCAGTTT 420
GATGCTTGTC TGCTACAGAT GTTTGCCATC CACTCCTTAT CTGGCATGGA ATCCACAGTG 480
CTGCTGGCCA TGGCTTTTGA CCGCTATGTG GCCATCTGTC ACCCACTGCG CCATGCCACA 540
GTACTTACGT TGCCTCGTGT CACCAAAATT GGTGTGGCTG CTGTGGTGCG GGGGGCTGCA 600
CTGATGGCAC CCCTTCCTGT CTTCATCAAG CAGCTGCCCT TCTGCCGCTC CAATATCCTT 660
TCCCATTCCT ACTGCCTACA CCAAGATGTC ATGAAGCTGG CCTGTGATGA TATCCGGGTC 720
AATGTCGTCT ATGGCCTTAT CGTCATCATC TCCGCCATTG GCCTGGACTC ACTTCTCATC 780
TCCTTCTCAT ATCTGCTTAT TCTTAAGACT GTGTTGGGCT TGACACGTGA AGCCCAGGCC 840
AAGGCATTTG GCACTTGCGT CTCTCATGTG TGTGCTGTGT TCATATTCTA TGTACCTTTC 900
ATTGGATTGT CCATGGTGCA TCGCTTTAGC AAGCGGCGTG ACTCTCCACT GCCCGTCATC 960
TTGGCCAATA TCTATCTGCT GGTTCCTCCT GTGCTCAACC CAATTGTCTA TGGAGTGAAG 1020
ACAAAGGAGA TTCGACAGCG CATCCTTCGA CTTTTCCATG TGGCCACACA CGCTTCAGAG 1080
CCCTAGGTGT CAGTGATCAA ACTTCTTTTC CATTCAGAGT CCTCTGATTC AGATTTTAAT 1140
GTTAACATTT TGGAAGACAG TATTCAGAAA AAAAATTTCC TTAATAAAAA TACAACTCAG 1200
ATCCTTCAAA TATGAAACTG GTTGGGGAAT CTCCATTTTT TCAATATTAT TTTCTTCTTT 1260
GTTTTCTTGC TACATATAAT TATTAATACC CTGACTAGGT TGTGGTTGGA GGGTTATTAC 1320
TTTTCATTTT ACCATGCAGT CCAAATCTAA ACTGCTTCTA CTGATGGTTT ACAGCATTCT 1380
GAGATAAGAA TGGTACATCT AGAGAACATT TGCCAAAGGC CTAAGCACAG CAAAGGAAAA 1440
TAAACACAGA ATATAATAAA ATGAGATAAT CTAGCTTAAA ACTATAACTT CCTCTTCAGA 1500
ACTCCCAACC ACATTGGATC TCAGAAAAAT ACTGTCTTCA AAATGACTTC TACAGAGAAG 1560
AAATAATTTT TCCTCTGGAC ACTAGCACTT AAGGGGAAGA TTGGAAGTAA AGCCTTGAAA 1620
AGAGTACATT TACCTACGTT AATGAAAGTT GACACACTGT TCTGAGAGTT TTCACAGCAT 1680
ATGGACCCTG TTTTTCCTAT TTAATTTTCT TATCAACCCT TTAATTAGGC AAAGATATTA 1740
TTAGTACCCT CATTGTAGCC ATGGGAAAAT TGATGTTCAG TGGGGATCAG TGAATTAAAT 1800
GGGGTCATAC AAGTATAAAA ATTAAAAAAA AAAGACTTCA TGCCCAATCT CATATGATGT 1860
GGAAGAACTG TTAAAGAGAC CAACAGGGTA GTGGGTTAGA GATTTCCAGA GTCTTACATT 1920
TTCTARAGGA GGTATTTAAT TTCTTCTCAC TCATCCAGTG TTGTATTTAG GAATTTCCTG 1980
GCAACAGAAC TCATGGCTTT AATCCCACTA GCTATTGCTT ATTGTCCTGG TCCAATTGCC 2040
AATTACCTGT GTCTTGGAAG AAGTGATTTC TAGGTTCACC ATTATGGAAG ATTCTTATTC 2100
AGAAAGTCTG CATAGGGCTT ATAGCAAGTT ATTTATTTTT AAAAGTTCCA TAGGTGTTTC 2160
TGATAGGCAG TGAGGTTAGG GAGCCACCAG TTATGATGGG AAGTATGGAA TGGCAGGTGT 2220
TGAAGATAAC ATTGGCCTTT TGAGTGTGAC TCGTAGCTGG AAAGTGAGGG AATCTTCAGG 2280
ACCATGCTTT ATTTGGGGCT TTGTGCAGTA TGGAACAGGG ACTTTGAGAC CGGGAAAGCA 2340
ATCTGACTTA GGCATGGGAA TCAGGCATTT TTGCTTCTGA GGGGCTATTA CCAAGGGTTA 2400
ATAGGTTTCA TCTTCAACAG GATATGACAA CAGTCTTAAC CAAGAAACTC AAATTACATA 2460
TACTAAAACA TGTGATCATA TATGTGGTAA GTTTCATTTT CTTTTTCAAT CCTCAGGTTC 2520
CCTGATATGG ATTCCTATNA CATGCTTTCA TCCCCTTTTG TAATGGATAT CATATTTGGA 2580
AATGCCTATT TAATACTTGT ATTTGCTGCT GGACTGTAAG CCCATGAGGG CACTGTTTAT 2640
TATTGAATGT CATCTCTGTT CATCATTGAC TGCTCTTTGC TCATCATTGA ATCCCCCAGC 2700
AAAGTGCCTA GAACATAATA GTGCTTATGC TTGACACCGG TTATTTTTCA TCAAACCTGA 2760
TTCCTTCTGT GCTGAACACA TAGCCAGGCA ATTTTCCAGC CTTCTTTGAG TTGGGTATTA 2820
TTAAATTTTA GCCATTACTT CCAATGTGAG TGGAAGTGAC ATGTGCAATT TTTATACCTG 2880
GCTCATAAAA CCCTCCCATG TGCAGCCTTT CATGTTGACA TTAAATGTGA CTTGGGAAGC 2940



TATGTGTTAC ACAGAGTTAA TTAACCNGAA AGGCCTGGNA ATTTTTTGNN AANNAAACTG 3000
TGGCCNNGAG GCCCNCAACC CTTTTTNNNA ATTTGGCAAN NTCCCACTTT GTANTTTGGT 3060 
AAGGAGGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 



SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE
Protein Accession #: not available, cloned at Eos

1 11 21 31 41 51
| | | | | |

MVDPNGNESS ATYFILIGLP GLEEAQPWLA FPLCSLYLIA VLGNLTIIYI VRTEHSLHEP 60
MYIFLCMLSG IDILISTSSM PKMLAIFWFN STTIQFDACL LQMFAIHSLS GMESTVLLAM 120
AFDRYVAICH PLRHATVLTL PRVTKIGVAA VVRGAALMAP LPVFIKQLPF CRSNILSHSY 180
CLHQDVMKLA CDDIRVNVVY GLIVIISAIG LDSLLISFSY LLILKTVLGL TREAQAKAFG 240
TCVSHVCAVF IFYVPFIGLS MVHRFSKRRD SPLPVILANI YLLVPPVLNP IVYGVKTKEI 300
RQRILRLFHV ATHASEP



SEQ ID NO:176 PM72 DNA SEQUENCE
Nucleic Acid Accession #: NM_004624.1

Coding sequence: 57-1544 (underlined sequences correspond to start and stop codons)






TCGGAGCCTG
CTCCTCCTCC
TGGTGGTCGC
GCGGCGGCGG
CGCTCTTGGG
ACAAGCAGTG
GGGACAACCT
CCCTCATCTT
ACGAAGGCTG
AGGCAGCGAG
CCATTGGCTA
TCAGGAAGCT
TGAGGGCTGC
AGTGCTCCGA
TGGCTAACTT

GCACATTCAC 
GGTGCTGGGA 
CCATCTTGGT 
GGCCCCCAGA 
TCCTGCTGAT 
TTAAGCCTGA 
TGGCTATCCT 
GGCGCTGGCA 
GCAGCAACGG 
CCCGCCGCTC 
CCAAGCGGCC 
GGGCGCGCCA 
GGACACTCCT 
GATGGGAGCT 
AGGCCCCCTA 
TGCTGGCTCT 
TGACCTGAGG 
CCTGAAATTT 
GACTGAAGAT 
GTGGGTTATT 
GTGGACTGGC 
CTGAAGCCTC 
TACCTGCTCT 
TTCTTATCTC 
CACCTATGTG 
AAGCAGATCC 
GTGAAAGCAC 
TTATTTGTTT 
CCCTCCCTGG 
CTGGTCACAG 
CCTCTGCCAG 
GGAAAAAAAA 



CGCCAATCAA 
TCTGCCCAAT 
GCAGAAAGGT 
CACCATTGCT 
GCAGCTCACT 
CTGGAGTTTT 
CCCTGGGTCA 
TGGGAAATGA 
CCAAGTCTCA 
TCTGTGCTGT 
CCAACTGTTG 
TCACCCTGCT 
GGACTCTTAC 
ACCACTTGTA 
AGTGTGGCTG 
CCTCCTCTGT 
AAGATCCCCT 
AAAA 



TGGTGGTGGT 
CTCAGGCGCC 
GCTCGCTCTC 
GTCGCGCGGC 
CAGGAGGAGT 
GCCCAGCTGG 
CCAGCCACCC 
TCCTCCATTC 
GAGCCTGGCC 
CAGCAGACCA 
CTCGCCACCC 
CGGAACTACA 
ATCAAAGACT 
GGCTGTAAGG 
CTGGTGGAGG 
TACTTCTGGG 
ACCATCGCCA 
TCCTCACTGT 
CTGTTTATTT 
AGTGACAGCA 
GGAGTACACT 
GTCTTTGAGC 
CTCAATGGTG 
GTCCTGGGCT 
AGCACGCAGG 
CAAGCCGAAG 
CTTCCCACTC 
TGGGCTCGGA 
GCCCTAGAGC 
AGGATGCAGG 
GGGCAAAAAG 
TGGAGGAAAG 
TCTGCCCGGG 
GTCAAGTTCC 
ACCCTATTCT 
TGTTTCGAGA 
GTCTGGTGGG 
GAAGGCAGCC 
GTGGCTTCAT 
GGAAGCAACA 
TAACTAGGCT 
ACACATACAG 
TGCTAACTTT 
TTATTAATGC 
AGGAGGCCTC 
CTGCCCTTCA 
CAGGACTGCA 



GGTGGTGGCC 
TCGGTGGCGG 
GGGGAGGCCG 
GGAGGCGGCT 
GTGACTATGT 
AGAATGAGAC 
CTCGGGGCCA 
AAGGCCGCAA 
CGTACCCCAT 
TGTTCTACGG 
TTCTGGTCGC 
TCCACATGCA 
TGGCCCTCTT 
CAGCCATGGT 
GCCTCTACCT 
GGTACATACT 
GGATCCATTT 
GGTGGATCAT 
GCATCATCCG 
GTCCATACTC 
ACATCATGTT 
TCGTCGTGGG 
AGGTGCAGGC 
GGAACCCCAA 
TTTCCATGCT 
TCTCCCTGGT 
GCAGCAGACG 
GGCTGCCCCC 
CTGCCTGGAG 
TGGAACTCAG 
TCTACATACT 
CAACCGGTGG 
AAGGTCACCA 
TTTGGGTTAA 
CTCTTTACGC 
GCACACCTAT 
AGGACGGTGC 
ACCAGCGAAT 
CTGTCAAGTG 
GGAATCAAGA 
CAGAGATGTG 
GATTTGAACT 
TGTGTATCGT 
CATTATCCCT 
CATCTCATGT 
CCCCAGTGGC 
ACAGGCTTGT 



CTCGCCCGCC 
TTGGTCGGCG 
GGGCGGATCT 
CGAGCTTCGT 
GCAGATGATC 
AATAGGCTGC 
GGTAGTTGTC 
TGTAAGCCGC 
TGCCTGTGGT 
TTCTGTGAAG 
CACAGCTATC 
CCTCTTCATA 
CGACAGCGGG 
CTTTTTCCAA 
GTACACCCTG 
CATCGGCTGG 
TGAGGATTAT 
AAAGGGCCCC 
AATCCTGCTT 
AAGGCTAGCC 



GTCTTTCCAG 
GGAGCTGAGG 
ATACCGGCAC 
GACCCGCGTC 
CTGACCACCA 
CCGGGGACAG 
GGCCCCCTGG 
CGTTTCTAGC 
TCATTAGACT 
TTCATCCTGA 
ATCCTCAAAC 
GCACCAACAC 
GCATTACCAC 
TTAGTTATCA 
CTTAGTGGTT 
AACCCAAGGA 
GCTAGGTCTC 
GGACTCTGTC 
GACTGCCCTC 
CACCCATGGG 
CAGATCTGTC 
AACCAGCCAG 
GAATTCCCCT 
ATCATCTGGA 
CACTCAGCTT 
GCAACAATAA 



TCACTCATGC 
GTTACGCGGC 
CGCGGCGCAG 
GCTGCGCGCT 
GAGGTGCAGC 
AGCAAGATGT 
TTGGCCTGTC 
AGCTGCACCG 
TTGGATGACA 
ACCGGCTACA 
CTGAGCCTGT 
TCCTTCATCC 
GAGTCGGACC 
TATTGTGTCA 
CTTGCCGTCT 
GGGGTACCCA 
GGTCTGCTCA 
ATCCTCACCT 
CAGAAACTGC 
AGGTCCACAC 
CCGGACAATT 
GGTTTTGTGG 
CGGAAGTGGC 
CCGTCGGGAG 
AGCCCAGGTG 
GGATCCCAGC 
AGGCCTGCCC 
TCTCTGGTCC 
AAGTGAGAGA 
CCTCCTCCAA 
CTCTGCCCCC 
AACACTGGTG 
CACGGTAGTG 
TCAGGCATTT 
GCTTTTTAAA 
CCCCACCGAA 
CTGAGGGACT 
GGACTAAGCC 
ACACCAGCCA 
CTTGTCCACC 
CTCTGACAGA 
TGATAGGAAT 
ATCCTCTTGG 
TGCCACCCCA 
TAGGAGCCTG 
CCTACCCACA 
ATGTTGGCTT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 



SEQ ID NO:177 PM72 Protein sequence:

Protein Accession #: JC2195

1 11 21 31 41 51

| | | | | |

MPPPPLLSLR RLGGGWSAVT RLVVAAAGAR SRGGRGGSRG AGGGGRGGVA RRRRLELRAA 60

RSLLGSSLQE ECDYVQMIEV QHKQCLEEAQ LEKETIGCSK MWDNLTCWPA TPRGQVVVLA 120




CPLIFKLFSS IQGRNVSRSC TDEGWTHLEP GPYPIACGLD DKAASLDEQQ TMFYGSVKTG 180

YTIGYGLSLA TLLVATAILS LPRKLHCTRN YIHMHLFISF ILRAAAVFIK DLALFDSGES 240

DQCSEGSVGC KAAMVFFQYC VMANFFWLLV EGLYLYTLLA VSFPSERKYF WGYILIGWGV 300

PSTFTMVWTI ARIHFEDYGL LRCWDTINSS LWWIIKGPIL TSILVNFILF ICIIRILLQK 360

LRPPDIRKSD SSPYSRLARS TLLLIPLFGV HYIMFAFFPD NFKPEVKMVF ELVVGSFQGF 420

VVAILYCFLN GEVQAELRRK WRRWHLQGVL GWNPKYRHPS GGSNGATCST QVSMLTRVSP 480
GARRSSSFQA EVSLV



SEQ ID NO:178 BFFB DNA SEQUENCE

Nucleic Acid Accession #: AL133619

Coding sequence: 1-2070 (underlined sequences correspond to start and stop codons)



1 11 21 31 

I I I I 

ATGAGCGGTG CGGGGGTGGC GGCTGGGACG CGGCCCCCCA 
CGGCGCCGGC GCCAGCGCCC CTCTGTGGGC GTCCAGTCCT 
CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG 
CAGCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG 
GAAAACAAGG GTGAGCCGGC GCGGGGCCCT AGGCCGGCCC 
ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT 
GGGGGAACAC AGGACGGGGA GCCCCTCCAG ACTGTCCTTG 
CCTGTATGCC AACCCAGTGG GTACAGGTTC TGGGGGACCT 
AGCCGTGGCT GGACGATGTT ATGCAGCCAA GCACAGCACG 
GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT 
CCAAGTAGAG CTGAAATGGG AAGGAACCCC TGGGACAGCC 
CCTCAGATTG CTGCTGTGGC CAGGCCCAGG ATTTCCAGCC 
ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAGG 
GCAGCAACCA TGGGGACAAA GGGAGGAAGC AGAGTCCTGT 
GCACTTCCCC ATCCTGACAG CGGCCCCCAC CCAGCCCAGG 
GCTCACTTCC CATTATCTTT GGGGCTGGGG CTGACATCAG 
TGGAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA 
GACATGGAGA AGGGGGTTGA GGGAGGGCCC TTCCCTAGCC 
CTGTTCTGGG CAAAGTGTGG CCCAAGTCGG CAGCCCCAGC 
GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT 
CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT 
GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGGTAGAGC 
AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG 
GGCGGTAGCG CCGACACTGT GCGCTCTCCT GCAGACAGCC 
TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA 
TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA 
CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG 
GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG 
AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC 
ACCACACTTA GGCAGTGCGA AGTGCTCATC CGCGAGCTGT 
ACCCAAGAGC TGCGGCACCT CAAGTCCCTC CTGGAAGGGA 
CCGGAGGAAG CTAGCTTTCC CAGGGACCAA GAAGCCACGC 
AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG 
CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA 
AAACGGCCCC TGCATCGCTC AGTGCTTTGA 




GCTCCCCAGA 
CCTGCCCTGC 
CTATGGCTCT 
GATCCCTTCC 
TTCCTTGCCA 
ATCCTGGGCT 
GAGGACATCT 
GGGCTCTCCC 
GCTGTGGCAA 
CCTGCAGTGC 
GCTGTTCCAT 
CCAGGGCCTC 
CGGGAGGACC 
GCAAGCGTGG 
TCTCCATGTC 
AGGCCAGGCC 
AGGCGGACCT 
TACAAGGGCA 
GGAACAGCCA 
CCCTTCCCCT 
GGAATACCAA 
GCCAGAGGCC 
ATTTCCCCAA 
AGCGTGCCAT 
AGAGGCTGCA 



51 
I 

CCCGGGCTCT 
GAGCCCGCAG 
GCAGTTCCTG 
TCTGAAGCGG 
GGCACACTCA 
CCTGGGCTCA 
TGCACTGGCC 
CGCTACCTCT 
GGGAAGCCCA 
CCTCCCTCCT 
TAGATCTTTG 
GAGTCCTCAC 
TGCCATCTGG 
CTTGTCCAAG 
GTGGTCTCAA 
GACTGGTGGA 
TTCCCAGGGA 
CTCCAGTGAG 
TGGGGACGCT 
GTGTCCCAAG 
TGCTCCCTTG 
CAGCCCTGCC 
GCGTCTTGCG 
AAGCTTCCAG 
CCAGCCCGGC 
GGAAGAGGAG 
GGCCAGAAAG 
GCACCAGGGC 
GCGAAAGCCC 
CCTCCTGCAG 
CCAGGCAGCC 
GGTCTCCACC 
CCTGCCCGCA 
GGCAATGCAG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



SEQ ID NO:179 PFFB Protein sequence:
Protein Accession #: T43457

1 11 21 31 41 51
| | | | | |

MSGAGVAAGT RPPSSPTPGS RRRRQRPSVG VQSLRPOSPQ LRQSDPQKRN LDLEKSLQFL 60
QQGHSEMLAK LHEEIEHLKR ENKGEPARGP RPALPPOAHS TLPLPCHRNT AINSSTRLGS 120
GGTQDGEPLQ TVLAHLAALA PVCQPSGYRF WGTWTDAATS SRGWTMLCSQ AQHVLLSGSP 180
GPEVIAGRQV ATGCSPDLPP PSRAEMGRNP WDSPCPARSL PQIAAVARPR ISSPMALSPH 240
MLGAQGIWTH SIQGSLPAIW AATMGTKGGS RVLFPCHLSK ALPHPDSGPH PAQDPGLWSQ 300
AHFPLSLGLG LTSGGHLTGG WSQPGNIAAG AVPRALPSQG DMEKGVEGGP FPSRCGNSSE 360
LFWAKCGPSR QPQPCSAGDA DRTREEAMLS LGTCCSKCPK PSCFPDGPSG NHLSRASAPL 420
GARWVCINGV WVEPGGPSPA RLKEGSSRTH RPGGKRGRLA GGSADTVRSP ADSLSMSSFQ 480
SVKSISNSAN SCGKARPQPG SFNKQDSKAD VSQKADLEEE PLLHNSKLDK VPGVQGQARK 540
ERAEASNAGA ACHGNSQHQG RQMGAGAHPP HILPLPLRKP TTLRQCEVLI RELWNTNLLQ 600
TQELRHLKSL LEGSQRPQAA PEEASFPRDQ EATHFPKVST KSLSKKCLSP PVAERAILPA 660
LKQTPKNNFA ERQKRLCAMQ KRRLHRSVL



SEQ ID NO:180 6CR4 DNA SEQUENCE
Nucleic Acid Accession #: NM_012319.2

Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60 
CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120 






GCGGAGACGA AGGCCCAATG GCGAGGAAGT TATCTGTAAT CTTCATCCTG ACCTTTGCCC 180 

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATCTTCACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGGGT 360

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCATG 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTCC AAGACCTGGA AAACTCTTCC 780 

CCAAAGATGT AAGCAGCTCC ACTCCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 
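
As an illustration of how the coding-sequence coordinates in these listings can be read, the following minimal Python sketch assumes the coordinates are 1-based and inclusive. That assumption is consistent with SEQ ID NO:180, where positions 138-140 read ATG and positions 2403-2405 read TAG. The helper names `extract_cds` and `looks_like_orf` are hypothetical and are not part of the original disclosure.

```python
def extract_cds(listing_rows: str, start: int, end: int) -> str:
    """Return the coding sequence for 1-based, inclusive coordinates.

    `listing_rows` should contain only the sequence rows of a listing
    (no header lines), since every A/C/G/T/N letter is kept.
    """
    # Keep nucleotide letters only; this drops spaces, line breaks and the
    # running position counters (60, 120, ...) printed at the end of each row.
    bases = "".join(ch for ch in listing_rows.upper() if ch in "ACGTN")
    cds = bases[start - 1:end]
    if len(cds) != end - start + 1:
        raise ValueError("coordinates fall outside the supplied sequence")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds


def looks_like_orf(cds: str) -> bool:
    """Check that a sequence starts with ATG and ends with a stop codon."""
    return cds.startswith("ATG") and cds[-3:] in {"TAA", "TAG", "TGA"}


if __name__ == "__main__":
    # Toy row, not taken from the listing: the ORF occupies positions 3-11.
    toy = "CCATG AAATA GCC 60"
    cds = extract_cds(toy, 3, 11)
    print(cds, looks_like_orf(cds))
```

Under the same assumption, the range 138-2405 spans 2268 bases, that is 756 codons including the stop codon.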



SEQ ID NO:181 BCR4 PROTEIN SEQUENCE 
Protein Accession #: NP_036451



1 11 21 31 41 51 

I I I I I I 

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW ESGINVDLAI STRQYHLQQL 60

FYRYGENNSL SVEGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 180

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLFPKDVSS STPPSVTSKS RVSRLAGRKT 240

NESVSEPRKG FMYSRNTNEN PQECFNASKL LTSHGMGIQV PLNATEFNYL CPAIINQIDA 300

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISFLSL LGVILVPLMN RVFFKFLLSF 360

LVALAVGTLS GDAFLHLLPH SHASHHHSHS HEEPAMEMKR GPLFSHLSSQ NIEESAYFDS 420

TWKGLTALGG LYFMFLVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480

EEKVDTDDRT EGYLRADSQE PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE YVPRGCKNKC 540

HSHFHDTLGQ SDDLIHHHHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 600

MGDGLHNFSD GLAIGAAFTE GLSSGLSTSV AVFCHELPHE LGDFAVLLKA GMTVKQAVLY 660

NALSAMLAYL GMATGIFIGH YAENVSMWIF ALTAGLFMYV ALVDMVPEML HNDASDHGCS 720
RWGYFFLQNA GMLLGFGIML LISIFEHKIV FRINF



SEQ ID NO:182 BCY2 DNA sequence
Nucleic Acid Accession #: NM_001203
Coding sequence: 274-1782 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

CGCGGGGCGC GGAGTCGGCG GGGCCTCGCG GGACGCGGGC AGTGCCGAGA CCGCGGCGCT 60 
GAGGACGCGG GAGCCGGGAG CGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAGACT TCTGCTGATT 180 
CATAACCATT TGGCTCTGAG CTATGACAAG AGAGGAAACA AAAAGTTAAA CTTACAAGCC 240 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA 300 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCACCCCCCG TCCAAAGGTC 360
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420
GACGGATATT GTTTCACGAT GATAGAAGAG GATGACTCTG GGTTGCCTGT GGTCACTTCT 480
GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT TCCTCATCAA 540
AGAAGATCAA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACCCTACA 600
CTGCCTCCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT 660
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATTTTG TTACTTCCGG 720
TATAAAAGAC AAGAAACCAG ACCTCGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC 780
ATTCCTCCTG GAGAATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA 840
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900 
ATTGGAAAAG GTCGCTATGG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT 960 
GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020 
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080 
GGGTCCTGGA CCCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT 1140 
TATCTGAAGT CCACCACCCT AGACGCTAAA TCAATGCTGA AGTTAGCCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 1380
ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA 1440
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAG 1500
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 
ATGACAGAAT GCTGGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA 1740 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCT GATAGGAGAG GAAAAGTAAG 1800
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA 1860
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGACCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATCCGTG 1980 
TCTGTTTGTA GGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT 

SEQ ID NO:183 BCY2 Protein sequence 

Protein Accession #: NP_001194



1 11 21 31 41 51 
I I I I I I 

MLLRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMIEED 60
DSGLPVVTSG CLGLEGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 120
GPIHHRALLI SVTVCSLLLV LIILFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDLI 180
EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240
WFRETEIYQT VLMRHENILG FIAADIKGTG SWTQLYLITD YHENGSLYDY LKSTTLDAKS 300
MLKLAYSSVS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGTCCIAD LGLAVKFISD 360
TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HFQSYIMADM YSFGLILWEV ARRCVSGGIV 420
EEYQLPYHDL VPSDPSYEDM REIVCIKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480
RLTALRVKKT LAKMSESQDI KL



SEQ ID NO:184 CBF9 DNA sequence

Nucleic Acid Accession #: AC005383

Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51



GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTGGGTGACC CGCGTAGAAG TGAAGTACTT 60 

TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120 

CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240 

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360

GTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480 

ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540 

CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 



CAGGAAGTGA AGGCAAGAAT CAAGAGGATG GTTTTCAAAG GAGGGCGCAC GGAGACGGAA 720

CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAG GCAGAAATGC TTCTGTGCCC 780 

CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATCTGGCACT GCCATCCAAG 840 

CAGCTGAAGG AAAGGGGTGT CACTGTGTTT GCTGTGGGGG TCAGGTTTCC CAGGTGGGAG 900 

GAGCTGCATG CACTGGCCAG CGAGCCTAGA GGGCAGCACG TGCTGTTGGC TGAGCAGGTG 960

GAGGATGCCA CCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGCCATCTG CTCCAGCGCC 1020 

ACGCCAGACT GCAGGGTCGA GGCTCACCCC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080 

GAGTTCGCTG GCAATGCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCTGGCT 1140 

GCACACTGTC CCTTCTACAG CTGGAAGAGA GTGTTCCTAA CCCACCCTGC CACCTGCTAC 1200 

AGGACCACCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC AGAATGGAGG CACATGTGTT 1260

CCAGAAGGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC 1320 

TGTGCCCTGA AGCTGAGCCT GGAATGCAGG GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380 

GCGGGCACCA CTCTGGACGG CTTCCTGCGG GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG 1440 

GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500 

CTGGTGGCGG TGCCTGTGGG GGAGTACCAG GATGTGCCTG ACCTGGTCTG GAGCCTCGAT 1560

GGCATTCCCT TCCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG 1620 

CGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680 

CTCACTGAGT CACACTCCGA GGATGAGGTT GCGGGCCCAG CGCGTCACGC AAGGGCGCGA 1740 

GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG GCCGTGCGGG CAGAGCTGGA GGAGATCACA 1800 

GGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860

GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG CGGCCAGGGT GCCGGACACA AGCCCTGGAC 1920 

CTCGTCTTCA TGTTGGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980 

AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040 

CTGGTGGTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACCCGG 2100 

GCTGCGATGC TGCGGGCCAT TAGCCAGGCC CCCTACCTAG GTGGGGTGGG CTCAGCCGGC 2160

ACCGCCCTGC TGCACATCTA TGACAAAGTG ATGACCGTCC AGAGGGGTGC CCGGCCTGGT 2220 

GTCCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGATGC AGCCGTTCCT 2280 

GCCCAGAAGC TGAGGAACAA TGGCATCTCT GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA 2340 

AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC CGGGATTCCC TGATCCACGT GGCAGCTTAC 2400 

GCCGACCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG 2460

CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520 

GGGAGCTACC GCTGCAAGTG TCGGGATGGC TGGGAGGGCC CCCACTGCGA GAACCGTGAG 2580 

TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640 

ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700 

GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTGCCC CAGGTCCTTA GAATGTCTGC 2760

TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTCCC AACTGCAGCC 2820 

ATGCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AACGATGTTG TTGAAAAGTT 2880 

TTGATGTGTA AGTAAATACC CACTTTCTGT ACCTGCTGTG CCTTGTTGAG GCTATGTCAT 2940 

CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000 

CGTTCCTTTG CACACAATCA ATGCTCGCCA GAATGTTGTT GACACAGTAA TGCCCAGCAG 3060

AGGCCTTTAC TAGAGCATCC TTTGGACGGC GAAGGCCACG GCCTTTCAAG ATGGAAAGCA 3120 

GCAGCTTTTC CACTTCCCCA GAGACATTCT GGATGCATTT GCATTGAGTC TGAAAGGGGG 3180 

CTTGAGGGAC GTTTGTGACT TCTTGGCGAC TGCCTTTTGT GTGTGGAAGA GACTTGGAAA 3240 

GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300 

TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCGTT CTGAGTCGTG AGCAGTGTCC 3360

ACCTTGAAGG TCTTC 

SEQ ID NO:185 CBF9 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKMMWCSAAV DIMFLLDGSN 60

SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQFSSTPH LEFPLDSFST QQEVKARIKR 120

MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLEMV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360 

RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420

LTGSALRQAA ERGFGSATRT GQDRPRRVVV LLTESHSEDE VAGPARHARA RELLLLGVGS 480

EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540

SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLVVYGSQVQ TAFGLDTKPT RAAMLRAISQ 600

APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVVVLT GGRGAEDAAV PAQKLRNNGI 660

SVLVVGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720

CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HMAPVQEGSS 780 

RTPPSNYREG LGTEMVPTFW NVCAPGP 



SEQ ID NO:186 PAV1 DNA sequence

Nucleic Acid Accession #: AF272890

Coding Sequence: 87-1520 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

TGCTACCCGC GCCCGGGCTT CTGGGGTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 60 
CCCGCCCCCG GCCTCCGCAG CTCGGCATGG GCGCGGGGGT GCTCGTCCTG GGCGCCTCCG 120 
AGCCCGGTAA CCTGTCGTCG GCCGCACCGC TCCCCGACGG CGCGGCCACC GCGGCGCGGC 180
TGCTGGTGCC CGCGTCGCCG CCCGCCTCGT TGCTGCCTCC CGCCAGCGAA AGCCCCGAGC 240

CGCTGTCTCA GCAGTGGACA GCGGGCATGG GTCTGCTGAT GGCGCTCATC GTGCTGCTCA 300 

TCGTGGCGGG CAATGTGCTG GTGATCGTGG CCATCGCCAA GACGCCGCGG CTGCAGACGC 360 

TCACCAACCT CTTCATCATG TCCCTGGCCA GCGCCGACCT GGTCATGGGG CTGCTGGTGG 420 

TGCCGTTCGG GGCCACCATC GTGGTGTGGG GCCGCTGGGA GTACGGCTCC TTCTTCTGCG 480 

AGCTGTGGAC CTCAGTGGAC GTGCTGTGCG TGACGGCCAG CATCGAGACC CTGTGTGTCA 540 

TTGCCCTGGA CCGCTACCTC GCCATCACCT CGCCCTTCCG CTACCAGAGC CTGCTGACGC 600 

GCGCGCGGGC GCGGGGCCTC GTGTGCACCG TGTGGGCCAT CTCGGCCCTG GTGTCCTTCC 660 

TGCCCATCCT CATGCACTGG TGGCGGGCGG AGAGCGACGA GGCGCGCCGC TGCTACAACG 720 

ACCCCAAGTG CTGCGACTTC GTCACCAACC GGGCCTACGC CATCGCCTCG TCCGTAGTCT 780 

CCTTCTACGT GCCCCTGTGC ATCATGGCCT TCGTGTACCT GCGGGTGTTC CGCGAGGCCC 840 

AGAAGCAGGT GAAGAAGATC GACAGCTGCG AGCGCCGTTT CCTCGGCGGC CCAGCGCGGC 900 

CGCCCTCGCC CTCGCCCTCG CCCGTCCCCG CGCCCGCGCC GCCGCCCGGA CCCCCGCGCC 960 

CCGCCGCCGC CGCCGCCACC GCCCCGCTGG CCAACGGGCG TGCGGGTAAG CGGCGGCCCT 1020 

CGCGCCTCGT GGGCCTACGC GAGCAGAAGG CGCTCAAGAC GCTGGGCATC ATCATGGGCG 1080 

TCTTCACGCT CTGCTGGCTG CCCTTCTTCC TGGCCAACGT GGTGAAGGCC TTCCACCGCG 1140 

AGCTGGTGCC CGACCGCCTC TTCGTCTTCT TCAACTGGCT GGGCTACGCC AACTCGGCCT 1200 

TCAACCCCAT CATCTACTGC CGCAGCCCCG ACTTCCGCAA GGCCTTCCAG GGACTGCTCT 1260 

GCTGCGCGCG CAGGGCTGCC CGCCGGCGCC ACGCGACCCA CGGAGACCGG CCGCGCGCCT 1320 

CGGGCTGTCT GGCCCGGCCC GGACCCCCGC CATCGCCCGG GGCCGCCTCG GACGACGACG 1380 

ACGACGATGT CGTCGGGGCC ACGCCGCCCG CGCGCCTGCT GGAGCCCTGG GCCGGCTGCA 1440 

ACGGCGGGGC GGCGGCGGAC AGCGACTCGA GCCTGGACGA GCCGTGCCGC CCCGGCTTCG 1500 

CCTCGGAATC CAAGGTGTAG GGCCCGGCGC GGGGCGCGGA CTCCGGGCAC GGCTTCCCAG 1560 

GGGAACGAGG AGATCTGTGT TTACTTAAGA CCGATAGCAG GTGAACTCGA AGCCCACAAT 1620 

CCTCGTCTGA ATCATCCGAG GCAAAGAGAA AAGCCACGGA CCGTTGCACA AAAAGGAAAG 1680 

TTTGGGAAGG GATGGGAGAG TGGCTTGCTG ATGTTCCTTG TTG 



SEQ ID NO:187 PAV1 Protein sequence
Protein Accession #: AA01 1 176



1 11 21 31 41 51 

I ! I I I I 

MGAGVLVLGA SEPGNLSSAA PLPDGAATAA RLLVPASPPA SLLPPASESP EPLSQQWTAG 60 

MGLLMALIVL LIVAGNVLVI VAIAKTPRLQ TLTNLFIMSL ASADLVMGLL VVPFGATIVV 120

WGRWEYGSFF CELWTSVDVL CVTASIETLC VIALDRYLAI TSPFRYQSLL TRARARGLVC 180

TVWAISALVS FLPILMHWWR AESDEARRCY NDPKCCDFVT NRAYAIASSV VSFYVPLCIM 240

AFVYLRVFRE AQKQVKKIDS CERRFLGGPA RPPSPSPSPV PAPAPPPGPP RPAAAAATAP 300

LANGRAGKRR PSRLVALREQ KALKTLGIIM GVFTLCWLPF FLANVVKAFH RELVPDRLFV 360

FFNWLGYANS AFNPIIYCRS PDFRKAFQGL LCCARRAARR RHATHGDRPR ASGCLARPGP 420
PPSPGAASDD DDDDVVGATP PARLLEPWAG CNGGAAADSD SSLDEPCRPG FASESKV



SEQ ID NO:188 BCO2 DNA sequence

Nucleic Acid Accession #: AJ400877

Coding sequence: 81-3080 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 
CCGCAACCGC TGAGCCATCC ATGGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120
CGGTGCTGCT GCTGCTGCTG CTGCTGCCGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180 
CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 240 
ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 
AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360 
TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420
TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480
AGAACAATGG CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540
GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600
GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGGAG GCCCCAAGGG 660
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720
TCTTGACCTG TAACCATGGC AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AGATGGGAGG AGCTGCCTTG 840 
AGCGAGAGGA CACTGTCCTG GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 
ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 
ACCGCACCTG TAAGGATACT TCGACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020 
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080 
GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200
GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260
CCCTGTATGG CTTCACCCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320
GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTCAAGGG GCTCCTGCCC ACAAGTGTGT 1440
CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500
GTCACTCTGG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560
AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTGAG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800
TGACCTGCAT CGTAAAGCGA ACCGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860
AGGCCGTCCA CAGGGAGCAG TTTCACCTCC AGCTCTCAGG CATGAACCTC GACGTGGCTA 1920
AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980
CAGAAAACCA ATGTGTCAGT TGCAGCCCTG GGACCTATTA TGATGGAGCA CGAGAACGCT 2040
GCATTTTATG TCCAAATGGA ACCTTCCAAA ATGAGGAAGG ACAAATGACT TGTGAACCAT 2100
GCCCAAGACC AGGAAATTCT GGGGCCCTGA AGACCCCAGA AGCTTGGAAT ATGTCTGAAT 2160
GTGGAGGTCT GTGTCAACCT GGTGAATATT CTGCAGATGG CTTTGCACCT TGCCAGCTCT 2220
GTGCCCTGGG CACGTTCCAG CCTGAAGCTG GTCGAACTTC CTGCTTCCCC TGTGGAGGAG 2280
GCCTTGCCAC CAAACATCAG GGAGCTACTT CCTTTCAGGA CTGTGAAACC AGAGTTCAAT 2340
GTTCACCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400
CATACCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG CCCAGGAAAT ACTACGACTG 2460
ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 
GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580
AGTGTACGTG GACCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATCGTG GTCCCTGAGA 2640
TCTTCCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT GATGCGGAAA ACCTCTTCAT 2700
CCAATTCTGT GACAACATAT GAAACCTGCC AGACCTACGA ACGCCCCATC GCCTTCACCT 2760
CCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATGA AGGGAACAGC GCTAGAGGGT 2820
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880
GAGATGGCAG GCTCTATGCA TCTGAGAACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940
TCAAGGCTCT GTTTGATGTC CTGGCCCATC CCCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000 
AGTCCCGAGA GATGTTTCCA AGATCGTTCA TCCGATTGCT ACGTTCCAAA GTGTCCAGGT 3060 
TTTTGAGACC TTACAAATGA CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180
CCGTATCAGT GACTCATTAG AGTTCAATTT TTATAGATAA TACAGATATT TTGGTAAATT 3240
GAACTTGGTT TTTCTTTCCC AGCATCGTGG ATGTAGACTG AGAATGGCTT TGAGTGGCAT 3300
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420 
TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480 
CCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540 
CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG ACCTGGGAGG 3600
ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTTGATCC CAGGAACTTG 3660
AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BCO2 Protein sequence

Protein Accession #: CAB92285



1 11 21 31 41 51 
I I I I I I 

MGVAGRNRPG AAWAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMNK 180
DHGCSHICKE APRGSVACEC RPGFELAKNQ RDCILTCNHG NGGCQHSCDD TADGPECSCH 240
PQYKMHTDGR SCLEREDTVL EVTESNTTSV VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300 
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360
SCQDVDECSL DRTCDHSCIN HPGTFACACN RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL 480 
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540 
PGAPGRPSTP KEMFITVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCILCPNG 660
TFQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYSADGFAP CQLCALGTFQ 720
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPEFG 780 
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840 
PPPKRRILIV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900 
IQFKSNEGNS ARGFQVPYVT YDEDYQELIE DIVRDGRLYA SENHQEILKD KKLIKALFDV 960
LAHPQNYFKY TAQESREMFP RSFIRLLRSK VSRFLRPYK

SEQ ID NO:190 BFG1 DNA sequence

Nucleic Acid Accession #: AF007170 

Coding sequence: 1-1725 (underlined sequences correspond to stop codon) 

1 11 21 31 41 51
I I I I I I

AAGGAGGCGG CCTCCGGGAA AAGCGACCGC AGGACTCCTG AGAGCAGCCT CCATGAGGCC 60
CTGGACCAGT GCATGACCGC CCTGGACCTC TTCCTCACCA ACCAGTTCTC AGAAGCACTC 120
AGCTACCTCA AGCCCAGAAC CAAGGAAAGC ATGTACCACT CACTGACATA TGCCACCATC 180
CTGGAGATGC AGGCCATGAT GACCTTTGAC CCTCAGGACA TCCTGCTTGC CGGCAACATG 240
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300
TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGCCAATTCA CTGAAGAAGA AATCCACGCT 360
GAGGTCTGCT ATCCAGAGTG CCTGCTGCAG CGAGCAGCCC TGACCTTCCT GCAGGACGAG 420
AACATGGTGA GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GACCTACAAG 480 
GAGCTGGACA GCCTTGTTCA GTCCTCACAA TACTGCAAGG GTGAGAACCA CCCGCACTTT 540 
GAAGGAGGAG TGAAGCTTGG TGTAGGGGCC TTCAACCTGA CACTGTCCAT GCTTCCTACT 600
AGGATCCTGA GGCTGTTGGA GTTTGTGGGG TTTTCAGGAA ACAAGGACTA TGGGCTGCTG 660
CAGCTGGAGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTGTGT CATGCTCCTG 720
CTGTGCTACC ACACCTTCCT CACCTTCGTG CTCGGTACTG GGAACGTCAA CATCGAGGAG 780
GCCGAGAAGC TCTTGAAGCC CTACCTGAAC CGGTACCCTA AGGGTGCCAT CTTCCTGTTC 840
TTTGCAGGGA GGATTGAAGT CATTAAAGGC AACATTGATG CAGCCATCCG GCGTTTCGAG 900
GAGTGCTGTG AGGCCCAGCA GCACTGGAAG CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATGTGGTGCT TCACCTACAA GGGCCAGTGG AAGATGTCCT ACTTCTACGC CGACCTGCTC 1020 
AGCAAGGAGA ACTGCTGGTC CAAGGCCACC TACATTTACA TGAAGGCCGC CTACCTCAGC 1080 
ATGTTTGGGA AGGAGGACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCGAGCT 1140
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAGAGAA GTTTGCCATC 1200
CGGAAGTCCC GGCGCTACTT CTCCTCCAAC CCTATCTCGC TGCCAGTGCC TGCTCTGGAA 1260
ATGATGTACA TCTGGAACGG CTACGCCGTG ATTGGGAAGC AGCCGAAACT CACGGATGGG 1320
ATACTTGAGA TTATCACTAA GGCTGAAGAG ATGCTGGAGA AAGGCCCAGA GAACGAGTAC 1380 
TCAGTGGATG ACGAGTGCTT GGTGAAATTG TTGAAAGGCC TGTGTCTGAA ATACCTGGGC 1440 
CGTGTCCAGG AGGCCGAGGA GAATTTTAGG AGCATCTCTG CCAATGAAAA GAAGATTAAA 1500 
TATGACCACT ACTTGATCCC AAACGCCCTG CTGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560
GACAGAAACG AAGAGGCCAT CAAACTTTTG GAATCTGCCA AGCAAAACTA CAAGAATTAC 1620
TCCATGGAGT CAAGGACACA CTTTCGAATC CAGGCAGCCA CACTCCAAGC CAAGTCTTCC 1680 
CTAGAGAACA GCAGCAGATC CATGGTCTCA TCAGTGTCCT TGTA£CTTTG TGCAGCAGTT 1740 
CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTCCTGAA AACATTTCAA AATACCCCCT 1800
CCCCCTGCCC TGCCCTGCCT TTGGGGTCCA CCGGCACTCC AGTTGGATGG CACAACATAG 1860 
TGTATCCGTG CAGAAGCCGA GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGCCAAG 1920
GGCAGAGCAG GTGGAGCCCT CTGCCTGCCC TATCACACAT ACGGGTACTT GCTTTTCACT 1980 
GTGATGTTTA AGAGAATGTA TGAACAGTTT ACATTTTCCT TAGAAATACA TTGATGGGAT 2040
CACAGTTGGC TTTAAAAACC AACAACAATC AACCACCTGT AAGTCTTTGT CTTCACCTAT 2100
TATCATCTGG AGGTAAATCT CTTTATATGA TGATGCCAAA GGGCAAATTG CTTTTCAAAT 2160 
TCAGCAAGTT CTCAGCTTGT GTGACGGAAG GTCCTTCAGA GGACCTGAGG AATGCCTGGG 2220
AGAGGCTAAG CCTCAGGCTT CAATGCTTCT GGGGTTGGGC ATGAGGATGT ACACAGACAC 2280
CCACTACCTT ACTACTCACA CTTCATTTCA CTCCTTTTGT AAATTTCCAA TTTAAAAATC 2340
AAGCACGTCT TTTTAGTGAG ATAAAATCTG AGCTCTTCTG TAGAAAAATC AATCTCTACC 2400 
AGTAGAAAAT GCCAGGGCTT GATGGAAGAG CTGTGTAGCC CTTTCTATGC CAAAGCCAGG 2460
AAATTTGGGG GGCAGGAGGA GGTTCTCAGA ATCCAGTCTG TATCTTTGCT GTATGCCAAA 2520 
CTGAAACCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTTCA CTCCAAGGTA 2580
CATTTGCTTA CTGACAGCAT TTTTGTTAAA ACTGTTATTC TTGAAAAAAA AAAAAAAAAA 2640
AA 

SEQ ID NO:191 BFG1 Protein sequence

Protein Accession #: AAC39582



1 11 21 31 41 51 
I I I I I I 

MTALDLFLTN QFSEALSYLK PRTKESMYHS LTYATILEMQ AMMTFDPQDI LLAGNMMKEA 60 
QMLCQRHRRK SSVTDSFSSL VNRPTLGQFT EEEIHAEVCY AECLLQRAAL TFLQDENMVS 120
FIKGGIKVRN SYQTYKELDS LVQSSQYCKG ENHPHFEGGV KLGVGAFNLT LSMLPTRILR 180 
LLEFVGFSGN KDYGLLQLEE GASGHSFRSV LCVMLLLCYH TFLTFVLGTG NVNIEEAEKL 240
LKPYLNRYPK GAIFLFFAGR IEVIKGNIDA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300 
TYKGQWKMSY FYADLLSKEN CWSKATYIYM KAAYLSMFGK EDHKPFGDDE VELFRAVPGL 360 
KLKIAGKSLP TEKFAIRKSR RYFSSNPISL PVPALEMMYI WNGYAVIGKQ PKLTDGILEI 420
ITKAEEMLEK GPENEYSVDD ECLVKLLKGL CLKYLGRVQE AEENFRSISA NEKKIKYDHY 480
LIPNALLELA LLLMEQDRNE EAIKLLESAK QNYKNYSMES RTHFRIQAAT LQAKSSLENS 540
SRSMVSSVSL 



Nucleic Acid Accession #: NM_032583

Coding sequence: 1-4044 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

ATGACTAGGA AGAGGACATA CTGGGTGCCC AACTCTTCTG GTGGCCTCGT GAATCGTGGC 60
ATCGACATAG GCGATGACAT GGTTTCAGGA CTTATTTATA AAACCTATAC TCTCCAAGAT 120
GGCCCCTGGA GTCAGCAAGA GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTCCCACCG 180
TGGGGGAAGT ATGATGCTGC CTTGAGAACC ATGATTCCCT TCCGTCCCAA GCCGAGGTTT 240
CCTGCCCCCC AGCCCCTGGA CAATGCTGGC CTGTTCTCCT ACCTCACCGT GTCATGGCTC 300
ACCCCGCTCA TGATCCAAAG CTTACGGAGT CGCTTAGATG AGAACACCAT CCCTCCACTG 360 
TCAGTCCATG ATGCCTCAGA CAAAAATGTC CAAAGGCTTC ACCGCCTTTG GGAAGAAGAA 420 
GTCTCAAGGC GAGGGATTGA AAAAGCTTCA GTGCTTCTGG TGATGCTGAG GTTCCAGAGA 480 
ACAAGGTTGA TTTTCGATGC ACTTCTGGGC ATCTGCTTCT GCATTGCCAG TGTACTCGGG 540 
CCAATATTGA TTATACCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600 
CATGGAGTGG GACTCTGCTT TGCCCTTTTT CTCTCCGAAT GTGTGAAGTC TCTGAGTTTC 660
TCCTCCAGTT GGATCATCAA CCAACGCACA GCCATCAGGT TCCGAGCAGC TGTTTCCTCC 720
TTTGCCTTTG AGAAGCTCAT CCAATTTAAG TCTGTAATAC ACATCACCTC AGGAGAGGCC 780
ATCAGCTTCT TCACCGGTGA TGTAAACTAC CTGTTTGAAG GGGTGTGCTA TGGACCCCTA 840
GTACTGATCA CCTGCGCATC GCTGGTCATC TGCAGCATTT CTTCCTACTT CATTATTGGA 900
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGTATTCATG 960 
AGAAGAATGG CTGTGAAGGC TCAGCATCAC ACATCTGAGG TCAGCGACCA GCGCATCCGT 1020 
GTGACCAGTG AAGTTCTCAC TTGCATTAAG CTGATTAAAA TGTACACATG GGAGAAACCA 1080 
TTTGCAAAAA TCATTGAAGG TATGGAAAGT CTGACTTTCT GCTCCAAACC TGGTGATGGC 1140
ATGGCCTTCA GCATGCTGGC CTCCTTGAAT CTCCTTCGGC TGTCAGTGTT CTTTGTGCCT 1200 
ATTGCAGTCA AAGGTCTCAC GAATTCCAAG TCTGCAGTGA TGAGGTTCAA GAAGTTTTTC 1260 
CTCCAGGAGA GCCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTGTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAGCTGGAGA GGAACGGGCA TGCTTCTGAG GGGATGACCA GGCCTAGAGA TGCCCTCGGG 1440
CCAGAGGAAG AAGGGAACAG CCTGGGCCCA GAGTTGCACA AGATCAACCT GGTGGTGTCC 1500
AAGGGGATGA TGTTAGGGGT CTGCGGCAAC ACGGGGAGTG GTAAGAGCAG CCTGTTGTCA 1560
GCCATCCTGG AGGAGATGCA CTTGCTCGAG GGCTCGGTGG GGGTGCAGGG AAGCCTGGCC 1620
TATGTCCCCC AGCAGGCCTG GATCGTCAGC GGGAACATCA GGGAGAACAT CCTCATGGGA 1680 
GGCGCATATG ACAAGGCCCG ATACCTCCAG GTGCTCCACT GCTGCTCCCT GAATCGGGAC 1740 
CTGGAACTTC TGCCCTTTGG AGACATGACA GAGATTGGAG AGCGGGGCCT CAACCTCTCT 1800
GGGGGGCAGA AACAGAGGAT CAGCCTGGCC CGCGCCGTCT ATTCCGACCG TCAGATCTAC 1860
CTGCTGGACG ACCCCCTGTC TGCTGTGGAC GCCCACGTGG GGAAGCACAT TTTTGAGGAG 1920
TGCATTAAGA AGACACTCAG GGGGAAGACG GTCGTCCTGG TGACCCACCA GCTGCAGTAC 1980
TTAGAATTTT GTGGCCAGAT CATTTTGTTG GAAAATGGGA AAATCTGTGA AAATGGAACT 2040
CACAGTGAGT TAATGCAGAA AAAGGGGAAA TATGCCCAAC TTATCCAGAA GATGCACAAG 2100
GAAGCCACTT CGGACATGTT GCAGGACACA GCAAAGATAG CAGAGAAGCC AAAGGTAGAA 2160 
AGTCAGGCTC TGGCCACXTC CCTGGAAGAG TCTCTCAACG GAAATGCTGT GCCGGAGCAT 2220 
CAGCTCACAC AGGAGGAGGA GATGGAAGAA GGCTCCTTGA GTTGGAGGGT CTACCACCAC 2280
TACATCCAGG CAGCTGGAGG TTACATGGTC TCTTGCATAA TTTTCTTCTT CGTGGTGCTG 2340
ATCGTCTTCT TAACGATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGGA GCAGGGCTCG 2400
GGGACCAATA GCAGCCGAGA GAGCAATGGA ACCATGGCAG ACCTGGGCAA CATTGCAGAC 2460 
AATCCTCAAC TGTCCTTCTA CCAGCTGGTG TACGGGCTCA ACGCCCTGCT CCTCATCTGT 2520 
GTGGGGGTCT GCTCCTCAGG GATTTTCACC AAAGTCACGA GGAAGGCATC CACGGCCCTG 2580 
CACAACAAGC TCTTCAACAA GGTTTTCCGC TGCCCCATGA GTTTCTTTGA CACCATCCCA 2640
ATAGGCCGGC TTTTGAACTG CTTCGCAGGG GACTTGGAAC AGCTGGACCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTGTCC TTAATGGTGA TCGCCGTCCT GTTGATTGTC 2760 
AGTGTGCTGT CTCCATATAT CCTGTTAATG GGAGCCATAA TCATGGTTAT TTGCTTCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880
TCTCCTTTAT TCTCCCACAT CCTCAATTCT CTGCAAGGCC TGAGCTCCAT CCATGTCTAT 2940
GGAAAAACTG AAGACTTCAT CAGCCAGTTT AAGAGGCTGA CTGATGCGCA GAATAACTAC 3000
CTGCTGTTGT TTCTATCTTC CACACGATGG ATGGCATTGA GGCTGGAGAT CATGACCAAC 3060 
CTTGTGACCT TGGCTGTTGC CCTGTTCGTG GCTTTTGGCA TTTCCTCCAC CCCCTACTCC 3120
TTTAAAGTCA TGGCTGTCAA CATCGTGCTG CAGCTGGCGT CCAGCTTCCA GGCCACTGCC 3180
CGGATTGGCT TGGAGACAGA GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240
AAGATGTGTG TCTCGGAAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC CCAGGGGTGG 3300 
CCACAGCATG GGGAAATCAT ATTTCAGGAT TATCACATGA AATACAGAGA CAACACACCC 3360 
ACCGTGCTTC ACGGCATCAA CCTGACCATC CGCGGCCACG AAGTGGTGGG CATCGTGGGA 3420
AGGACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCTCTCT TCCGCCTGGT GGAGCCCATG 3480 
GCAGGCCGGA TTCTCATTGA CGGCGTGGAC ATTTGCAGCA TCGGCCTGGA GGACTTGCGG 3540 
TCCAAGCTCT CAGTGATCCC TCAAGATCCA GTGCTGCTCT CAGGAACCAT CAGATTCAAC 3600
CTAGATCCCT TTGACCGTCA CACTGACCAG CAGATCTGGG ATGCCTTGGA GAGGACATTC 3660
CTGACCAAGG CCATCTCAAA GTTCCCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG CCAGGGCTGT GCTTCGCAAC 3780 
TCCAAGATCA TCCTTATCGA TGAAGCCACA GCCTCCATTG ACATGGAGAC AGACACCCTG 3840 
ATCCAGCGCA CAATCCGTGA AGCCTTCCAG GGCTGCACCG TGCTCGTCAT TGCCCACCGT 3900
GTCACCACTG TGCTGAACTG TGACCACATC CTGGTTATGG GCAATGGGAA GGTGGTAGAA 3960
TTTGATCGGC CGGAGGTACT GCGGAAGAAG CCTGGGTCAT TGTTCGCAGC CCTCATGGCC 4020
ACAGCCACTT CTTCACTGAG ATAAGGAGAT GTGGAGACTT CATGGAGGCT GGCAGCTGAG 4080
CTCAGAGGTT CACACAGGTG CAGCTTCGAG GCCCACAGTC TGCGACCTTC TTGTTTGGAG 4140 
ATGAGAACTT CTCCTGGAAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200
ACCCTGGAAT AGGCTACTTG ATGGCTCTCA AGACCTTAGA ACCCCAGAAC CATCTAAGAC 4260
ATGGGATTCA GTGATCATGT GGTTCTCCTT TTAACTTACA TGCTGAATAA TTTTATAATA 4320
AGGTAAAAGC TTATAGTTTT CTGATCTGTG TTAGAAGTGY TGCAAATGCT GTACTGACTT 4380
TGTAAAATAT AAAACTAAGG AAAACTCAAA AAAAAAAAAA AAAAAAA 

IP NQ;1 W PFW Prtifl" SMV 
Protein Accession #: NP_115972.1

1 11 21 31 41 51 
I I I I I I 

MTRKRTYWVP NSSGGLVNRG IDIGDDMVSG UYKTYTLQD GPWSQQERNP EAPGRAAVPP 60
WGKYDAALRT MIPFRPKPRF PAPQPLDNAG LFSYLTVSWL TPLMIQSLRS RLDENTIPPL 120 
SVHDASDKNV QRLHRLWEEE VSRRGIEKAS VLLVMLRFQR TRL1FDAIXG ICFCIASVLG 180
PIUIPKILE YSEEQLGNW HGVGLCFALF LSECVKSLSF SSSWHNQRT AIRFRAAVSS 240
FAFEKLIQFK SVIHITSGEA ISFFTGDVNY LFEGVCYGPL VUTCASLVI CSISSYFIIG 300 
YTAFIAHjCY LLVFPLAVFM TRMAVKAQHH TSEVSDQRIR VTSEVLTCIK UKMYTWEKP 360
FAKJIEGMES LTFCSKPGDG MAFSMLASLN LLRLSVFFVP 1AVKGLTNSK SAVMRFKKFF 420 
LQESPVFYVQ TLQDPSKALV FEEATLSWQQ TCPGPVNGAL ELERNGHASE GMTRPRDALG 480 
PEEEGNSLGP ELHKINLWS KGMMLGVCGN TGSGKSSLLS AILEEMHLLE GSVGVQGSLA 540
YVPQQAWIVS GNIREN1LMG GAYDKARYLQ VLHCCSLNRD LELLPFGDMT EIGERGLNLS 600 
GGQKQRISLA RAVYSDRQIY LLDDPLSAVD AHVGKHIFEE CIKKTLRGKT WLVTHQLQY 660 
LEFCGQIILL ENGKJCENGT HSELMQKKGK YAQUQKMHK EATSDMLQDT AK1AEKPKVE 720 
SQALATSLEE SLNGNAVPEH QLTQEEEMEE GSLSWRVYHH YIQAAGGYMV SCIIFFFVVL 780
IVFLTIFSFW WLSYWLEQGS GTNSSRESNG TMADLGNIAD NPQLSFYQLV YGLNALLLIC 840
VGVCSSGIFT KVTRKASTAL HNKLFNKVFR CPMSFFDTIP IGRLLNCFAG DLEQLDQLLP 900
IFSEQFLVLS LMVIAVLLIV SVLSPYILLM GAIIMVICFI YYMMFKKAIG VFKRLENYSR 960
SPLFSHILNS LQGLSSIHVY GKTEDFISQF KRLTDAQNNY LLLFLSSTRW MALRLEIMTN 1020
LVTLAVALFV AFGISSTPYS FKVMAVNIVL QLASSFQATA RIGLETEAQF TAVERILQYM 1080
KMCVSEAPLH MEGTSCPQGW PQHGEIIFQD YHMKYRDNTP TVLHGINLTI RGHEVVGIVG 1140
RTGSGKSSLG MALFRLVEPM AGRILIDGVD ICSIGLEDLR SKLSVIPQDP VLLSGTIRFN 1200
LDPFDRHTDQ QIWDALERTF LTKAISKFPK KLHTDVVENG GNFSVGERQL LCIARAVLRN 1260
SKIILIDEAT ASIDMETDTL IQRTIREAFQ GCTVLVIAHR VTTVLNCDHI LVMGNGKVVE 1320
FDRPEVLRKK PGSLFAALMA TATSSLR

SEQ ID NO:194 BHB6 DNA sequence

Nucleic Acid Accession #: AA9832S1

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGCTGTCTG GCTTCTTGAT GAGTCCCAGT ACCCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TTCCGTGGGA GGCTTCCATC GGTGCGCACA CCTCCCGAGG GCGAGGCAGC 120 

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGGGACCG CGCTGCAGCC 180 

GGGGAGGCGG AGAAGGGGAA CCGGGGCGAG CCGCCCGCCT GGATCCGCGC CCAGCAGCAG 240 

CCGCGGCCGC CGCCAGCTGG GCAGGCTCCC GGGACTGCGG CTGGGGGCGC GCAGGACCCT 300 

CGCCTGCGTC CTGGACGTTC CCGGGGGAGG GTCCGGTTGC CAGTGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCCGGGG GCCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420 

ACTCATAAGG CAGTCCCTAA GGGGACCGGG CCACCGGCTG AGGACGGGGA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCCG GCGTCGTCGC CTCCTGGGCG TCGCGGCAGA GGGGAGTGGC 540 

CCGCGCGGAA AGCGCCGCGG GACAGTCAGT GACGAGGCCC GGGGGTCGCC GGGGCCACGA 600 

CTTCTCGGAG ACCGTCCTGC GCTCTCTGGA GACGCGCTGT CCGCGCCCAG GGTGGTGCCA 660 

TGTGGGGCGC TCGCCGCTCG TCCGTCTCCT CATCCTGGAA CGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTG GCGGCGGGGG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 900 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 960 

CGGGCGGACA AAGACGGGCC CCGACGGCTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AGGGTGCGCC CCCACCCGTG AGGGCCTGGC AGCGGTGCTC CCCTGAAGGC 1080 

TCCCCGAAAG GAAGGCAGCT CCTCAGGGCT TTCCCGGGGC TGCTGCCCCG TGCCAGACGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTCCCC TGCAGCGGCC CGCCTTGCCC 1200 

ATCTACGTGC CGTTCCTCAT TGTTGGCTCC GTGTTTGTCG CCTTTATCAT CTTGGGGTCC 1260 

CTGGTGGCAG CCTGTTGCTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATCCCCATGA TCCCCAGTGC CAGCACCTCC 1380 

CGGGGGTCGT CCTCACGCCA GTCCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

GGGGCCCGGG CGCCCCCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA AGGGACCATG 1500 

AACAACGTGT ATGTCAACAT GCCCACGAAT TTCTCTGTGC TGAACTGTCA GCAGGCCACC 1560 

CAGATTGTGC CACATCAAGG GCAGTATCTG CATCCCCCAT ACGTGGGGTA CACGGTGCAG 1620 

CACGACTCTG TGCCCATGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA GCCTGGCTAC 1680 

AGGCAGATTC AGTCCCCCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC CGAGAGTCAC TGGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTG 1800 

GATTCTCGAG GTGGAAGTCC GCACATGTCG GTGGTATTTA TGGCACGATT CCTTTGGATG 1860 

GCTTCATTTG CCCCCAGACT GTATGAAAAC ATCTCCGAAT TAGCATTTCT GGATATGTTT 1920 

CATCCAGGGT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTCGAGAT GACTGTGATG 1980 

TTGCTGATGG GTGTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATG TTGTTATTAT TATTTCTTTT 2100 

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

TTTTTTTTTT TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGCGATC TCGGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGCCTC 2280 

AGCCTCCCAC GTGGCTGGGA TTACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCACC ATGTTGGCTG GGCTGGTCTC ACTCTCCTGA CCTCAAGCAA 2400 

TCTGCCTGTC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TTTTTTTCTA ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520 

ATTCTAAAAG GAAACCTGTT TGAACTCTGT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACACCTT AATTTCATTG TAAAAAGATA TATATATTTT GTCTATTTTT GTGCTTTTGG 2640 

GGGCCTATTT TGTGCTTTTT TACCTTATGT AGAGATCTTA TTACAAAGTG ATTTTCTACA 2700 

TTAAAAAGAG ACTGAAATAA ATTGTATAGT TACTTAACTA ATGAAGACAT TTCAGAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTCA TAAAACCATT CATCCCCTTC 2820 

TTGATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAG TTACTTTTAA TTTTGCCTTC 2940

TGGTTATGGT TTGGCGTTTC CTTCTGTTTG GTTTTCAGAG CCCCATGTCT ATATAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GCCATAACCT GAGACTTGGG ATGAAATTTA AACCAGATAC GATTTACTTY 3180 

GCAGATCATA AGGCTTTTTA TACTCTTGTT ATCAAAATCG CTTATTTTTC AGGCACTAAG 3240 

GATTGTTAAG AGAAAAGCTT TTCAACGAAG GATTGCCTTT CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420 

ATGTGCAAGT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAACCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA ACCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA



SEQ ID NO:195 BHB6 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51
| | | | | |

MLSGFLMSPS TQHRAQYTPG GKKLPWEASI GAHTSRGRGS DRERESRPEA AGLLWDRAAA 60
GEAEKGNRGE PPAWIRAQQQ PRPPPAGQAP GTAAGGAQDP RLRPGRSRGR VRLPVKPPEA 120
SGRQPRGPSD CIPRFPSASA THKAVPKGTG PPAEDGDGLG APGPRARRRR LLGVAAEGSG 180
PRGKRRGTVS DEARGSPGPR LLGDRPALSG DALSAPRVVP CGALAARPSP HPGTPLRSCS 240
CCWLRCWRRG RGPSGEYCHG WLDAQGVWRI GFQCPERFDG GDATICCGSC ALRYCCSSAE 300
ARLDQGGCDN DRQQGAGEPG RADKDGPRRL GRASCLRGTQ GDGEGAPPPV RAWQRCSPEG 360
SPKGRQLLRA FPGLLPRARR RGFPSSPRGG PSPLQRPALP IYVPFLIVGS VFVAFIILGS 420
LVAACCCRCL RPKQDPQQSR APGGNRLMET IPMIPSASTS RGSSSRQSST AASSSSSANS 480
GARAPPTRSQ TNCCLPEGTM NNVYVNMPTN FSVLNCQQAT QIVPHQGQYL HPPYVGYTVQ 540
HDSVPMTAVP PFMDGLQPGY RQIQSPFPHT NSEQKMYPAV TV



SEQ ID NO:196 CQA5 DNA SEQUENCE

Nucleic Acid Accession #: AA0884S8

Coding sequence: 862-1995 (underlined sequences correspond to start and stop codons)



GCCCTTGGAC 
CTGAAGAAAA 
GCGCGGGGCC 
CTGGGCCAGA 
CGGCTACTGC 
TGTGCCAGCC 
ACCTCACCCC 
CTCACCCAGG 
GCGCTCATTA 
GATTCCACCT 
AGCCCTTCGA 
GCCCAGGCAC 
GCCTGCCCCC 
ACATGGGCTG 
TGGACAGTGG 
GGTCCCATCT 
AGAGGGCGCG 
CAGGACGAGG 
GTAAGCGGGG 
CTGGCCAAGG 
GGCCTGCATG 
TTGCCCCACG 
GACAGCTCCC 
CTGGGGTCCT 
GAGAGGCCAC 
GGCAGGTCCC 
GGAGTACGCA 
GAACCAGGGG 
TCAGTGTGTG 
CCCGATGCGG 
ACACTGTCCC 
CCTTCCGGAG 
TGCTGCACCT 
GCCCTCCTAC 
ACCTCCTGGG 
AGGTGGACTG 
GGCTGGGGTC 
TGGGGGATCC 
GGTGACTTCA 
GAGACAGGCT 
AAAGAAATAG 
CACGAGGGGA 
GCAGACCCTG 
GAGCAGCGTC 
GCGTGCACAC 
CAGAAGTGTC 
TTTTGTGTTG 
CTGGAATCCC 
CCCCATCTCT 
TCATAAACAC 
TAGACCCAGA 
AGAAATAAAA 



11 
I 

ACTGACATGG 
AGGAGCTGGA 
GCGACTGGTA 
GCAGAGCCAG 
CCAAGGTACA 
GGGCCCTGCC 
CGGTCTGGCA 
AGGTGACCGA 
AGCAGCTGTT 
TCATCTAGTC 
GGGTGGGCGC 
AGTCCCGGAG 
GGCTGGTCCC 
GGGGCTCTCT 
GGTACCCCTC 
TCAGGGAAAG 
GGGCGGCTCC 
TGGCTGTAGC 
GGTGCCTGCC 
CTGAGGGACC 
TGCCTCCCAC 
TTGAGTCCCA 
AGGCACGTCA 
GCTCACCCCC 
CTCCCTCAGC 
CTTGGGTGTC 
CTGGTGGGGG 
CACGGCAACA 
TGGGGCGCAG 
GGTCAGTGCG 
ACAAGGCACC 
CCCAGCTCCA 
GGTCTGCAGG 
CCTGAAGATG 
CAGGAAAGGG 
CAGCGCAGTG 
TGCCCACCAG 
TGGCATCTTT 
TCAGGAGACC 
GGCACCTCCG 
GTCCTCCCAG 
GAATTTAAAG 
CCTGGAGCCT 
CCTGGGCTCT 
TGTGATGACA 
CCCAGTTGAG 
ATCAAGTTCC 
AGCACTTGAG 
ACAARAAAAA 
CACAAGGAAA 
TACTAGAATT 
GAGATTTCTG 




CAAGGAAAAC 
ACTCCCTCAG 
GGCCCTGCTC 
GCATCGATGG 
GGCCTCCGAT 
TGGGGGGCGC 
TGTCTCAGAG 
TGCTAACCTG 
GGTGTCCCAG 
GGAGTGGGCT 
TGCAGGTCCT 
GGTGGGCCAG 
GGCCTCCCCA 
ACTGGACTGG 
GCCCACATAG 
GAAAAACTGC 
TTTACAGCTT 
GCCCCGGCTG 
GCCCTAGGAC 
ATCCGCGAGG 
CCCGGAAATG 
AATCTGCCCC 
AAGGAAAAGG 
GCCAGGAGTT 
AAAAAGAAAG 
CAATACACTA 
ATCAGAGAGA 
GAAACATGAA 



TCCGGGCCCC 
ATCCTCATGC 
CGCATCACGC 
GCCCTGAGCC 
GCGTGGGCCC 
CCACCCTCTC 
CCTGCCGCCC 
GCTTGACTCC 
TAGTCCGCAG 
CGTCCCCCCG 
CGCCAGGCTG 
CAAGGGCAGC 
GGAAGTAGAT 
GCCCCAGGGA 
CGGATCGGCA 
GTGATGGCCT 
TGTGAGCCTG 
CTGTTTCCCC 
ACGCCCAGCC 
GAGAACCCCC 
CCCCTGCCCA 
AGCCCAACCT 
GTTCTGCAGC 
GCGGGGTCAG 
AGGGCCCCCT 
GAGGGGCCCT 
CCCACAGCAA 
GACAGGCCCA 
TTCCAGGGGA 
GAGGGCCTGT 
TGGCAGCCAG 
CGTCTGCCTT 
AAGCAGGAGA 
AGCTGGACCC 
CTTTCAGCCT 
GAAATCAGGC 
GCAGGGTCTA 
GCTGGGCGGG 
TGCCAGTAGC 
TCTCAGGATG 
AGAGGAACAC 
AACATCTCAG 
CCAGAGCAGC 
AAAGAAAATG 
TGAGACCCAG 
ATATAAAGTA 
AAAAAA 



41 
I 

CACGAGGACA 
TGCAGGGTTT 
TGCAGGAGCG 
GGAGCCCCCG 
GGGAGCTGCT 
CCTGCCCTGC 
TGAAGGAGCA 
AGCTGGAGCA 
AGCAGGACGG 
CCAGGGCCAG 
TGGCTGGAGA 
TTGCCAGATG 
GTTTKGGCTC 
CTACTACTGG 
TTTCCAGCGG 
CACTTCCAAC 
TTCCCGCTCA 
GGAGGGGGTG 
TAGCGGTCGG 
CGCCGGGTGG 
TCCCCCTCTT 
GCTCCCCAGG 
CGACTCAGGA 
TGTCCCCAGG 
AGGGTACAGG 
GGCCCACTCC 
GGAGGGTCCC 
CCAGGGCCCC 
TGCGTGGGGG 
CGTGTCCAGG 
GGCAGGCAGC 
CCCCACAGAG 
AGTCAGCCCA 
CATAAGGATG 
GCCCCACAGC 
GGAGAAGCCC 
TGAGGGTGCC 
CAGAACAGTG 
CGCAGCTGAA 
TGGTGTTOCG 
TAGTGAGTGG 
GGTGGCTGGC 
TCAGTCTCCG 
GTGTGCAGGT 
TTGAAATGTG 
ACCCACACCA 
CCGGGCGTGG 
CTGGGCAACG 
AGAGATCCAG 
CAGAAGCAAC 
ACAGTGTTTT 



51 
I 

CTGACATGGA 
GGAGATGATG 
CCAGCGCCGC 
CCCACTGGGG 
GGCTGCAGCC 
CCTGACGTCC 
GAACCGACTC 
GGAGAAGTCG 
GGGACCTCTG 
CCTGGCACTC 
CCCCCGGCAG 
GGCTCCCCAG 
CTGGTTGYTG 
CCGCTGTCAG 
TCCCGCCCTG 
AACGGGCAGC 
ACCAGGGCAC 
GGGACGGCCT 
ACTTCAGGTT 
GCGAGAGCTT 
GGCCGGGACG 
AGGGCCCCCA 
TTTCCAAGGC 
TTTCAGCTGG 
AGGAGGCTGG 
CGCTGGTGCT 
AGTGTCACCA 
CGATGCGGGG 
GCGCAGGGCC 
GCACTTTGGT 
GTGGCAACTC 
CCACATTCCC 
GCATGCAGCT 
TCAGGCCTGG 
CCCAGCACCC 
CCCGTCAGCA 
TGCCATGCCC 
TCTGTCCCGG 
GCGGAAATGT 
TGCAAGGTGA 
CCCTGGAGAC 
AGAGGCACAT 
TGCAGGATGT 
ACATACACGT 
TCCTTGGGGG 
GGCCTCAGGA 
TGGTTCACGC 
CAGTGAGAGA 
GTTTAAAAAT 
AGATTGACTC 
ATATATCTAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060

SEQ ID NO:197 LBG2 DNA SEQUENCE

Nucleic Acid Accession #: X63629

Coding sequence: 54-2543 (start and stop codons are underlined)

1 11 21 31 41 51
| | | | | |

GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCACCCC TCTCTCTGCA GCCATGGGGC 60
TCCCTCGTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTGCTGGCTG CAGTGCGCGG 120 
CCTCCGAGCC GTGCCGGGCG GTCTTCAGGG AGGCTGAAGT GACCTTGGAG GCGGGAGGCG 180 
CGGAGCAGGA GCCCGGCCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAGAGC 240 
CAGCTCTGTT TAGCACTGAT AATGATGACT TCACTGTGCG GAATGGCGAG ACAGTCCAGG 300
AAAGAAGGTC ACTGAAGGAA AGGAATCCAT TGAAGATCTT CCCATCCAAA CGTATCTTAC 360
GAAGACACAA GAGAGATTGG GTGGTTGCTC CAATATCTGT CCCTGAAAAT GGCAAGGGTC 420
CCTTCCCCCA GAGACTGAAT CAGCTCAAGT CTAATAAAGA TAGAGACACC AAGATTTTCT 480
ACAGCATCAC GGGGCCGGGG GCAGACAGCC CCCCTGAGGG TGTCTTCGCT GTAGAGAAGG 540
AGACAGGCTG GTTGTTGTTG AATAAGCCAC TGGACCGGGA GGAGATTGCC AAGTATGAGC 600 
TCTTTGGCCA CGCTGTGTCA GAGAATGGTG CCTCAGTGGA GGACCCCATG AACATCTCCA 660 
TCATCGTGAC CGACCAGAAT GACCACAAGC CCAAGTTTAC CCAGGACACC TTCCGAGGGA 720
GTGTCTTAGA GGGAGTCCTA CCAGGTACTT CTGTGATGCA GGTGACAGCC ACAGATGAGG 780
ATGATGCCAT CTACACCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840 
AGGACCCACA CGACCTCATG TTCACAATTC ACCGGAGCAC AGGCACCATC AGCGTCATCT 900 
CCAGTGGCCT GGACCGGGAA AAAGTCCCTG AGTACACACT GACCATCCAG GCCACAGACA 960 
TGGATGGGGA CGGCTCCACC ACCACGGCAG TGGCAGTAGT GGAGATCCTT GATGCCAATG 1020
ACAATGCTCC CATGTTTGAC CCCCAGAAGT ACGAGGCCCA TGTGCCTGAG AATGCAGTGG 1080
GCCATGAGGT GCAGAGGCTG ACGGTCACTG ATCTGGACGC CCCCAACTCA CCAGCGTGGC 1140
GTGCCACCTA CCTTATCATG GGCGGTGACG ACGGGGACCA TTTTACCATC ACCACCCACC 1200
CTGAGAGCAA CCAGGGCATC CTGACAACCA GGAAGGGTTT GGATTTTGAG GCCAAAAACC 1260
AGCACACCCT GTACGTTGAA GTGACCAACG AGGCCCCTTT TGTGCTGAAG CTCCCAACCT 1320
CCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCACCTGTG TTTGTCCCAC 1380
CCTCCAAAGT CGTTGAGGTC CAGGAGGGCA TCCCCACTGG GGAGCCTGTG TGTGTCTACA 1440
CTGCAGAAGA CCCTGACAAG GAGAATCAAA AGATCAGCTA CCGCATCCTG AGAGACCCAG 1500 
CAGGGTGGCT AGCCATGGAC CCAGACAGTG GGCAGGTCAC AGCTGTGGGC ACCCTCGACC 1560 
GTGAGGATGA GCAGTTTGTG AGGAACAACA TCTATGAAGT CATGGTCTTG GCCATGGACA 1620
ATGGAAGCCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT GATGTCAACG 1680 
ACCATGGCCC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAACCAAAGC CCTGTGCGCC 1740 
ACGTGCTGAA CATCACGGAC AAGGACCTGT CTCCCCACAC CTCCCCTTTC CAGGCCCAGC 1800
TCACAGATGA CTCAGACATC TACTGGACGG CAGAGGTCAA CGAGGAAGGT GACACAGTGG 1860
TCTTGTCCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTCTG 1920 
ACCATGGCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TGTGTGCGAC TGCCATGGCC 1980 
ATGTCGAAAC CTGCCCTGGA CCCTGGAAAG GAGGTTTCAT CCTCCCTGTG CTGGGGGCTG 2040 
TCCTGGCTCT GCTGTTCCTC CTGCTGGTGC TGCTTTTGTT GGTGAGAAAG AAGCGGAAGA 2100
TCAAGGAGCC CCTCCTACTC CCAGAAGATG ACACCCGTGA CAACGTCTTC TACTATGGCG 2160 
AAGAGGGGGG TGGCGAAGAG GACCAGGACT ATGACATCAC CCAGCTCCAC CGAGGTCTGG 2220 
AGGCCAGGCC GGAGGTGGTT CTCCGCAATG ACGTGGCACC AACCATCATC CCGACACCCA 2280 
TGTACCGTCC TAGGCCAGCC AACCCAGATG AAATCGGCAA CTTTATAATT GAGAACCTGA 2340
AGGCGGCTAA CACAGACCCC ACAGCCCCGC CCTACGACAC CCTCTTGGTG TTCGACTATG 2400
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CTCCTCCGCC TCCGACCAAG 2460
ACCAAGATTA CGATTATCTG AACGAGTGGG GCAGCCGCTT CAAGAAGCTG GCAGACATGT 2520 
ACGGTGGCGG GGAGGACGAC TAGGCGGCCT GCCTGCAGGG CTGGGGACCA AACGTCAGGC 2580 
CACAGAGCAT CTCCAAGGGG TCTCAGTTCC CCCTTCAGCT GAGGACTTCG GAGCTTGTCA 2640 
GGAAGTGGCC GTAGCAACTT GGCGGAGACA GGCTATGAGT CTGACGTTAG AGTGGTTGCT 2700
TCCTTAGCCT TTCAGGATGG AGGAATGTGG GCAGTTTGAC TTCAGCACTG AAAACCTCTC 2760
CACCTGGGCC AGGGTTGCCT CAGAGGCCAA GTTTCCAGAA GCCTCTTACC TGCCGTAAAA 2820
TGCTCAACCC TGTGTCCTGG GCCTGGGCCT GCTGTGACTG ACCTACAGTG GACTTTCTCT 2880 
CTGGAATGGA ACCTTCTTAG GCCTCCTGGT GCAACTTAAT TTTTTTTTTT AATGCTATCT 2940 
TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCCAGA GCTGCTGGGC CCACTGGCCG 3000
TCCTGCATTT CTGGTTTCCA GACCCCAATG CCTCCCATTC GGATGGATCT CTGCGTTTTT 3060 
ATACTGAGTG TGCCTAGGTT GCCCCTTATT TTTTATTTTC CCTGTTGCGT TGCTATAGAT 3120 
GAAGGGTGAG GACAATCGTG TATATGTACT AGAACTTTTT TATTAAAGAA A 



SEQ ID NO:198 LBG2 Protein sequence:

Protein Accession #: CAA45177 



1 11 21 31 41 51 

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180
YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240
DEDDAIYTYN GVVAYSIHSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300
TDMDGDGSTT TAVAVVEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360
AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420
PTSTATIVVH VEDVNEAPVF VPPSKVVEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480
DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540
VNDHGPVPEP RQITICNQSP VRHVLNITDK DLSPHTSPFQ AQLTDDSDIY WTAEVNEEGD 600
TVVLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720
GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD

SEQ ID NO:199 OBI5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012152

Coding sequence: 43-1104 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTCGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 OBI5 Protein sequence:

Protein Accession #: NP_036284

1 11 21 31 41 51 

I I I I I I 

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VIVLCVGTFF CLFIFFSNSL VIAAVIKNRK 60
FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWFLRQGLLD SSLTASLTNL 120
LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIAIFMG AVPTLGWNCL CNISACSSLA 180
PIYSRSYLVF WTVSNLMAFL IMVVVYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240
VMTVLGAFVV CWTPGLVVLL LDGLNCRQCG VQHVKRWFLL LALLNSVVNP IIYSYKDEDM 300
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS

SEQ ID NO:201 PAA6 DNA SEQUENCE

Nucleic Acid Accession #: AA569531

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGACCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAATTATGTT 60 

CATTCTGAAG CCAACAGGAG AACCAAGACC AAAACTTTAT TGTCTCTGCT TTCATTTCTT 120 

GATGAAACCT CTGGACTAAG CACACATCTT CCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GTGCTTCATC TGGACATCCA CGGGAAGAAG GAAGACATGA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGGTGGTTTT ACAATATTTA AGAACCTGTG GATGAGCCTC 300 

ATACCCAGAG GGAACAAACG CTCCCCAAAA AGAGTTACAG AAACCATCCT GAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA GATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAGGCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540 

GCTACGTTGT CCCAGCACTT CACTGGTTAA CCTTTTATGT CCACCATTTG TGGATTTCAC 600 

AGCTACTTGT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATC 660 

CCAGCTACTC TTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAATGT TTTATTTCTT GCCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840 
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

MTYSYSFFRP ELIVNHLNYV HSEANRRTKT KTLLSLLSFL DETSGLSTHL PCLSLSKECG 60 
VLHLDIHGKK EDMRITQQSS QLYLWDMGGF TIFKNLWMSL IPRGNKRSPK RVTETILRDF 120 
KQKQSSKIQE ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG

SEQ ID NO:203 PAB2 DNA SEQUENCE

Nucleic Acid Accession #: XM_050197

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

TCACACGTGC CAAGGGGCTG GCTCAGCGGA ACCAGCCTGC ACGCGCTGGC TCCGGGTGAC 60 

AGCCGCGCGC CTCGGCCAGG ATCTGAGTGA TGAGACGTGT CCCCACTGAG GTGCCCCACA 120 

GCAGCAGGTG TTGAGCATGG GCTGAGAAGC TGGACCGGCA CCAAAGGGCT GGCAGAAATG 180 

GGCGCCTGGC TGATTCCTAG GCAGTTGGCG GGAGCAAGGA GGAGAGGCCG CAGCTTCTGG 240

AGCAGAGCCG AGACGAAGCA GTTCTGGAGT GCCTGAACGG CCCCCTGAGC CCTACCCGCC 300 

TGGCCCACTA TGGTCCAGAG GCTGTGGGTG AGCCGCCTGC TGCGGCACCG GAAAGCCCAG 360

CTCTTGCTGG TCAACCTGCT AACCTTTGGC CTGGAGGTGT GTTTGGCCGC AGGCATCACC 420 

TATGTGCCGC CTCTGCTGCT GGAAGTGGGG GTAGAGGAGA AGTTCATGAC CATGGTGCTG 480 

GGCATTGGTC CAGTGCTGGG CCTGGTCTGT GTCCCGCTCC TAGGCTCAGC CAGTGACCAC 540

TGGCGTGGAC GCTATGGCCG CCGCCGGCCC TTCATCTGGG CACTGTCCTT GGGCATCCTG 600 

CTGAGCCTCT TTCTCATCCC AAGGGCCGGC TGGCTAGCAG GGCTGCTGTG CCCGGATCCC 660 

AGGCCCCTGG AGCTGGCACT GCTCATCCTG GGCGTGGGGC TGCTGGACTT CTGTGGCCAG 720 

GTGTGCTTCA CTCCACTGGA GGCCCTGCTC TCTGACCTCT TCCGGGACCC GGACCACTGT 780

CGCCAGGCCT ACTCTGTCTA TGCCTTCATG ATCAGTCTTG GGGGCTGCCT GGGCTACCTC 840

CTGCCTGCCA TTGACTGGGA CACCAGTGCC CTGGCCCCCT ACCTGGGCAC CCAGGAGGAG 900 

TGCCTCTTTG GCCTGCTCAC CCTCATCTTC CTCACCTGCG TAGCAGCCAC ACTGCTGGTG 960 

GCTGAGGAGG CAGCGCTGGG CCCCACCGAG CCAGCAGAAG GGCTGTCGGC CCCCTCCTTG 1020 

TCGCCCCACT GCTGTCCATG CCGGGCCCGC TTGGCTTTCC GGAACCTGGG CGCCCTGCTT 1080 

CCCCGGCTGC ACCAGCTGTG CTGCCGCATG CCCCGCACCC TGCGCCGGCT CTTCGTGGCT 1140

GAGCTGTGCA GCTGGATGGC ACTCATGACC TTCACGCTGT TTTACACGGA TTTCGTGGGC 1200 

GAGGGGCTGT ACCAGGGCGT GCCCAGAGCT GAGCCGGGCA CCGAGGCCCG GAGACACTAT 1260 

GATGAAGGCG TTCGGATGGG CAGCCTGGGG CTGTTCCTGC AGTGCGCCAT CTCCCTGGTC 1320 

TTCTCTCTGG TCATGGACCG GCTGGTGCAG CGATTCGGCA CTCGAGCAGT CTATTTGGCC 1380

AGTGTGGCAG CTTTCCCTGT GGCTGCCGGT GCCACATGCC TGTCCCACAG TGTGGCCGTG 1440

GTGACAGCTT CAGCCGCCCT CACCGGGTTC ACCTTCTCAG CCCTGCAGAT CCTGCCCTAC 1500 

ACACTGGCCT CCCTCTACCA CCGGGAGAAG CAGGTGTTCC TGCCCAAATA CCGAGGGGAC 1560 

ACTGGAGGTG CTAGCAGTGA GGACAGCCTG ATGACCAGCT TCCTGCCAGG CCCTAAGCCT 1620 

GGAGCTCCCT TCCCTAATGG ACACGTGGGT GCTGGAGGCA GTGGCCTGCT CCCACCTCCA 1680 

CCCGCGCTCT GCGGGGCCTC TGCCTGTGAT GTCTCCGTAC GTGTGGTGGT GGGTGAGCCC 1740

ACCGAGGCCA GGGTGGTTCC GGGCCGGGGC ATCTGCCTGG ACCTCGCCAT CCTGGATAGT 1800 

GCCTTCCTGC TGTCCCAGGT GGCCCCATCC CTGTTTATGG GCTCCATTGT CCAGCTCAGC 1860

CAGTCTGTCA CTGCCTATAT GGTGTCTGCC GCAGGCCTGG GTCTGGTCGC CATTTACTTT 1920 

GCTACACAGG TAGTATTTGA CAAGAGCGAC TTGGCCAAAT ACTCAGCGTA GAAAACTTCC 1980

AGCACATTGG GGTGGAGGGC CTGCCTCACT GGGTCCCAGC TCCCCGCTCC TGTTAGCCCC 2040

ATGGGGCTGC CGGGCTGGCC GCCAGTTTCT GTTGCTGCCA AAGTAATGTG GCTCTCTGCT 2100 

GCCACCCTGT GCTGCTGAGG TGCGTAGCTG CACAGCTGGG GGCTGGGGCG TCCCTCTCCT 2160 

CTCTCCCCAG TCTCTAGGGC TGCCTGACTG GAGGCCTTCC AAGGGGGTTT CAGTCTGGAC 2220 

TTATACAGGG AGGCCAGAAG GGCTCCATGC ACTGGAATGC GGGGACTCTG CAGGTGGATT 2280 

ACCCAGGCTC AGGGTTAACA GCTAGCCTCC TAGTTGAGAC ACACCTAGAG AAGGGTTTTT 2340

GGGAGCTGAA TAAACTCAGT CACCTGGTTT CCCATCTCTA AGCCCCTTAA CCTGCAGCTT 2400 

CGTTTAATGT AGCTCTTGCA TGGGAGTTTC TAGGATGAAA CACTCCTCCA TGGGATTTGA 2460 

ACATATGAAA GTTATTTGTA GGGGAAGAGT CCTGAGGGGC AACACACAAG AACCAGGTCC 2520 

CCTCAGCCCC ACAGGCACTG GTCTTTTTTC CTOGANTCCA CCCCCCCCCT CTTTACCCTT 2580

TT



SEQ ID NO:204 PAB2 Protein sequence:
Protein Accession #: XP_050197

1 11 21 31 41 51

| | | | | |

MVQRLWVSRL LRHRKAQLLL VNLLTFGLEV CLAAGITYVP PLLLEVGVEE KFMTMVLGIG 60
PVLGLVCVPL LGSASDHWRG RYGRRRPFIW ALSLGILLSL FLIPRAGWLA GLLCPDPRPL 120
ELALLILGVG LLDFCGQVCF TPLEALLSDL FRDPDHCRQA YSVYAFMISL GGCLGYLLPA 180
IDWDTSALAP YLGTQEECLF GLLTLIFLTC VAATLLVAEE AALGPTEPAE GLSAPSLSPH 240
CCPCRARLAF RNLGALLPRL HQLCCRMPRT LRRLFVAELC SWMALMTFTL FYTDFVGEGL 300
YQGVPRAEPG TEARRHYDEG VRMGSLGLFL QCAISLVFSL VMDRLVQRFG TRAVYLASVA 360
AFPVAAGATC LSHSVAVVTA SAALTGFTFS ALQILPYTLA SLYHREKQVF LPKYRGDTGG 420
ASSEDSLMTS FLPGPKPGAP FPNGHVGAGG SGLLPPPPAL CGASACDVSV RVVVGEPTEA 480
RVVPGRGICL DLAILDSAFL LSQVAPSLFM GSIVQLSQSV TAYMVSAAGL GLVAIYFATQ 540
VVFDKSDLAK YSA

SEQ ID NO:205 PAJ3 DNA SEQUENCE

Nucleic Acid Accession #: AK002126

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGTTCGCC GGGGGCTGCT TGCGTGGATT TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC 60

TGCTGTGCTA TCTCTGTCCT GTACATGTTG GCCTGCACCC CAAAAGGTGA CGAGGAGCAG 120

CTGGCACTGC CCAGGGCCAA CAGCCCCACG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180

GAGTGGGAGG AGCAGCACCG CAACTACGTG AGCAGCCTGA AGCGGCAGAT CGCACAGCTC 240

AAGGAGGAGC TGCAGGAGAG GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300

GCTGCTGGCC TGGGTCTGGA CAGGAGCCCC CCAGAGAAAA CCCAGGCCGA CCTCCTGGCC 360 

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420 

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAGACTGGC 480

CTTACCCGCC ACCCCGAGGA GAAGCCTGTG AGGAAGGACA AGCGGGATGA GTTGGTGGAA 540

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCACCGT 600 

CCTTACACGG CCTCTGATTT CATAGAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAATA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTGTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140

TATCCAGTTC TTTTCAGTCA GTACAATCCT GGCATAATAT ACGGCCACCA TGATGCAGTC 1200 

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGGAAACTG GATTTTGGAG AGACTTTGGA 1260 

TTTGGGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGGAC 1320 

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA AGTATCTCCA CAGCAACCTC 1380

ATAGTGGTAC GGACGCCTGT GCGAGGACTC TTCCACCTCT GGCATGAGAA GCGCTGCATG 1440

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGGAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 






SEQ ID NO:206 PAJ3 Protein sequence:
Protein Accession #: NP_060841

1 11 21 31 41 51

| | | | | |

MVRRGLLAWI SRVVVLLVLL CCAISVLYML ACTPKGDEEQ LALPRANSPT GKEGYQAVLQ 60
EWEEQHRNYV SSLKRQIAQL KEELQERSEQ LRNGQYQASD AAGLGLDRSP PEKTQADLLA 120
FLHSQVDKAE VNAGVKLATE YAAVPFDSFT LQKVYQLETG LTRHPEEKPV RKDKRDELVE 180
AIESALETLN NPAENSPNHR PYTASDFIEG IYRTERDKGT LYELTFKGDH KHEFKRLILF 240
RPFGPIMKVK NEKLNMANTL INVIVPLAKR VDKFRQFMQN FREMCIEQDG RVHLTVVYFG 300
KEEINEVKGI LENTSKAANF RNFTFIQLNG EFSRGKGLDV GARFWKGSNV LLFFCDVDIY 360
FTSEFLNTCR LNTQPGKKVF YPVLFSQYNP GIIYGHHDAV PPLEQQLVIK KETGFWRDFG 420
FGMTCQYRSD FINIGGFDLD IKGWGGEDVH LYRKYLHSNL IVVRTPVRGL FHLWHEKRCM 480
DELTPEQYKM CMQSKAMNEA SHGQLGMLVF RHEIEAHLRK QKQKTSSKKT



SEQ ID NO:207 PAJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF189723

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAATGGTCTA AACAAATGTG AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAATGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTTA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240 

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTGGCAA TACTTATCGT TGTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360 

CCAGAATGCC ATTGTGTGCG TGAAGGAAAA TTGGAGCATA CACTTGCCCG AGACTTGGTT 420

CCAGGTGATA CAGTTTGCCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480 

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCTTGTTCT 540

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600 

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TGGAACAGGA 660 

GAAAATTCTG AATTTGGGGA GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TCCCCATTGT GGTCACAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TCGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGGG GAAGCCAACA GAAGGGGCCT TAATTGCTCT TGCAATGAAG 1260 

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320 

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380 

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CCAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCAG AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGGACAGCTG 1560 

ACATTTCTTG GCTTGGTGGG AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTG TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740 

GAAGAAATAG ATGCAATGGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800 

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAACGGT 1860

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920

ATTGGAGTTG CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC 1980

CTAGTGGATG ATGATTTTCA AACCATAATG TCTGCAATCG AAGAGGGTAA AGGGATTTAT 2040 

AATAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTATAGCAGC ATTAACTTTA 2100 

ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 2160 

ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGCCTTG GAGTAGAACC AGTGGATAAA 2220

GATGTCATTC GTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAACTTGATA 2280 

CTTAAAATAC TTGTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT 2340 

GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 2400 

TTTTTTGACA TGTTCAATGC ACTAAGTTCC AGATCCCAGA CCAAGTCTGT GTTTGAGATT 2460 

GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCAGTTCTTG GATCCATCAT GGGACAATTA 2520

CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 2580 

GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTCGCAGA AATTATAAAG 2640 

AAGGTTGAAA GGAGCAGGGA AAAGATCCAG AAGCATGTTA GTTCGACATC ATCATCTTTT 2700 
CTTGAAGTAT GA 






SE Q ID Nfr2Q8 P A J5 Pqtfq sequence; 
Protein Accession #: AAF27613 



1 11 21 31 41 51 

20 | | | | | ) 

MIPVLTSKKA SELPVSEVAS ILQADLQNGL NKCEVSHRRA FHGWNEFDIS EDEPLWKXYI 60 

SQFKNPLIML LLASAVISVL MHQFDDAVSI TVAILIWTV AFVQEYRSEK SLEELSKLVP 120 

PECHCVREGK LEHTLARDLV PGDTVCLSVG DRVPADLRLF EAVDL SIDES SLTGETTPCS 160 

KVTAPQPAAT NGDLASRSNI AFMGTLVRCG KAKGWIGTG ENSEFGEVFK MMQAEEAPKT 240 

25 PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEMFT I SVS LAVAAI PEGLPIWTV 300 

TLALGVMRMV KKRAIVKKLP IVETLGCCNV 1CSDKTGTLT KNEMTVTHIF TSDGLHABVT 360 

GVGYNQPGEV IVDGDWHGF YNPAVSRIVE AGCVCNDAVI RNNTLMGKPT EGALIALAMK 420 

MGLDGLQQDY IRKAEYPFSS EQKWMAVKCV HRTQQDRPEI CFMKGAYEQV IKYCTTYQSK 480 

GQTLTLTQQQ RDVYQQEKAR MGSAGLRVLA LASGPELGQL TFLGLVGIID PPRTGVKEAV 540 

TTLIASGVSI KMITGDSQET AVAIASRLGL YSKTSQSVSG EEIDAMDVQQ LSQIVPKVAV 600

FYRASPRHKM KIIKSLQKNG SVVAMTGDGV NDAVALKAAD IGVAMGQTGT DVCKEAADMI 660

LVDDDFQTIM SAIEEGKGIY NNIKNFVRFQ LSTSIAALTL ISLATLMNFP NPLNAMQILW 720

INIIMDGPPA QSLGVEPVDK DVTRKPPRNW KDSILTKNLI LKILVSSIII VCGTLFVFWR 780 

ELRDNVITPR DTTMTFTCFV FFDMFNALSS RSQTKSVFEI GLCSNRMFCY AVLGSIMGQL 840 

LVTYFPPLQK VFQTESLSIL DLLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSF 900
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE

Nucleic Acid Accession #: N62096
Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240 

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540 

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600 

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260
ATTAGTATCT TTCAACTCGA GTAA 

SEQ ID NO:210 PAV4 Variant 1 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60

LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120

GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK 180

PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 240

FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300

IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM 360

SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 420
ISIFQLE

SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE

Nucleic Acid Accession #: N6209B

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60 

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300 

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360 

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480 

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600 

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780 

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900 

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080 

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200 
TAA 



SEQ ID NO:212 PAV4 Variant 2 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

| | | | | |

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60 

SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120

TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180

EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240

GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300

NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1140 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60 

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360 

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600 

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660 

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 

SEQ ID NO:214 PAV4 Variant 3 Protein sequence: 
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I 
MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 60

PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLILGIV MARAISLGPH 120

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF 180

ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN 240

VFFGGNLSSV FHIVVTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL 300

SEEPRTHSDK IMSCVMLPIG AVVMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES 360
HVQQTTQLST LNISIFQLE 



SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE: 

Nucleic Acid Accession #: N62095

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60
ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120
GTTGTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180
GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCCCTT 240
GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300
AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360
ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420
ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480
ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540
TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 
TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 
ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720 
TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780 
GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840 
TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900 
AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020 

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080 

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence:
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFN VVNSIIGSGI IGLPYSMKQA 60

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120

IAMISYNIIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVMSFA FICHHNSFLV 240

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIVV TVMVITVATL VSLLIDCLGI 360

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAVVMV FGFVMAITNT 420
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ

SEQ ID NO:217 PAV9 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGAGGATG CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120 

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCTGGTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACAGGAGCCT GGATTGTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTGTACGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTGTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480 

GCGAGGTACC GGTGGCGCGG TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTCCTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660 

ATTGACATCC CTGTCCTGCT CCTCCTGATT GATGGTGATG AGAAGATGTT GACGCGAATA 720 

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780 

GACTGCCTGG CGGAGACCCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840 

GGCGAAGCCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACCCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960 
GGGTCTGAGG AATTCGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG TGGGAGCTCG 1020

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTTGGAACCG CGTGGACATT 1080

GCCCAGAGTG AACTCTTTCG GGGGGACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140

CTCATGGACG CCCTGCTGAA TGACCGGCCT GAGTTCGTGC GCTTGCTCAT TTCCCACGGC 1200

CTCAGCCTGG GCCACTTCCT GACCCCGATG CGCCTGGCCC AACTCTACAG CGCGGCGCCC 1260

TCCAACTCGC TCATCCGCAA CCTTTTGGAC CAGGCGTCCC ACAGCGCAGG CACCAAAGCC 1320

CCAGCCCTAA AAGGGGGAGC TGCGGAGCTC CGGCCCCCTG ACGTGGGGCA TGTGCTGAGG 1380

ATGCTGCTGG GGAAGATGTG CGCGCCGAGG TACCCCTCCG GGGGCGCCTG GGACCCTCAC 1440

CCAGGCCAGG GCTTCGGGGA GAGCATGTAT CTGCTCTCGG ACAAGGCCAC CTCGCCGCTC 1500

TCGCTGGATG CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTG GGCACTGTTG 1560

CTGAACAGGG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620

GCTCTTGGGG CCTGTTTGCT GCTCCGGGTG ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680

GCAGCACGGA GGAAAGACCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGGC 1740

GAGTGCTATC GCAGCAGTGA GGTGAGGGCT GCCCGCCTCC TCCTCCGTCG CTGCCCGCTC 1800

TGGGGGGATG CCACTTGCCT CCAGCTGGCC ATGCAAGCTG ACGCCCGTGC CTTCTTTGCC 1860

CAGGATGGGG TACAGTCTCT GCTGACACAG AAGTGGTGGG GAGATATGGC CAGCACTACA 1920

CCCATCTGGG CCCTGGTTCT CGCCTTCTTT TGCCCTCCAC TCATCTACAC CCGCCTCATC 1980

ACCTTCAGGA AATCAGAAGA GGAGCCCACA CGGGAGGAGC TAGAGTTTGA CATGGATAGT 2040

GTCATTAATG GGGAAGGGCC TGTCGGGACG GCGGACCCAG CCGAGAAGAC GCCGCTGGGG 2100

GTCCCGCGCC AGTCGGGCCG TCCGGGTTGC TGCGGGGGCC GCTGCGGGGG GCGCCGGTGC 2160

CTACGCCGCT GGTTCCACTT CTGGGGCGCG CCGGTGACCA TCTTCATGGG CAACGTGGTC 2220

AGCTACCTGC TGTTCCTGCT GCTTTTCTCG CGGGTGCTGC TCGTGGATTT CCAGCCGGCG 2280

CCGCCCGGCT CCCTGGAGCT GCTGCTCTAT TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340

CTGCGCCAGG GCCTGAGCGG AGGCGGGGGC AGCCTCGCCA GCGGGGGCCC CGGGCCTGGC 2400

CATGCCTCAC TGAGCCAGCG CCTGCGCCTC TACCTCGCCG ACAGCTGGAA CCAGTGCGAC 2460

CTAGTGGCTC TCACCTGCTT CCTCCTGGGC GTGGGCTGCC GGCTGACCCC GGGTTTGTAC 2520

CACCTGGGCC GCACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCG GCTGCTTCAC 2580

ATCTTCACGG TCAACAAACA GCTGGGGCCC AAGATCGTCA TCGTGAGCAA GATGATGAAG 2640

GACGTGTTCT TCTTCCTCTT CTTCCTCGGC GTGTGGCTGG TAGCCTATGG CGTGGCCACG 2700

GAGGGGCTCC TGAGGCCACG GGACAGTGAC TTCCCAAGTA TCCTGCGCCG CGTCTTCTAC 2760

CGTCCCTACC TGCAGATCTT CGGGCAGATT CCCCAGGAGG ACATGGACGT GGCCCTCATG 2820

GAGCACAGCA ACTGCTCGTC GGAGCCCGGC TTCTGGGCAC ACCCTCCTGG GGCCCAGGCG 2880

GGCACCTGCG TCTCCCAGTA TGCCAACTGG CTGGTGGTGC TGCTCCTCGT CATCTTCCTG 2940

CTCGTGGCCA ACATCCTGCT GGTCAACTTG CTCATTGCCA TGTTCAGTTA CACATTCGGC 3000

AAAGTACAGG GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060

TTCCACTCTC GGCCCGCGCT GGCCCCGCCC TTTATCGTCA TCTCCCACTT GCGCCTCCTG 3120

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTCGAGCAT 3180

TTCCGGGTTT ACCTTTCTAA GGAAGCCGAG CGGAAGCTGC TAACGTGGGA ATCGGTGCAT 3240

AAGGAGAACT TTCTGCTGGC ACGCGCTAGG GACAAGCGGG AGAGCGACTC CGAGCGTCTG 3300

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC TGGGACACAT CCGCGAGTAC 3360

GAACAGCGCC TGAAAGTGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCGT CCTGGGGTGG 3420

GTGGCCGAGG CCCTGAGCCG CTCTGCCTTG CTGCCCCCAG GTGGGCCGCC ACCCCCTGAC 3480
CTGCCTGGGT CCAAAGACTG A



SEQ ID NO:218 PAV9 Protein sequence:

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | |

MEDAFGAAVV TVWDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD PAAVYSLVTR 60

TWGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAWIVTGGL HTGIGRHVGV 120 

AVRDHQMAST GGTKVVAMGV APWGVVRNRD TLINPKGSFP ARYRWRGDPE DGVQFPLDYN 180

YSAFFLVDDG THGCLGGENR FRLRLESYIS QQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240 

ENATQAQLPC LLVAGSGGAA DCLAETLEDT LAPGSGGARQ GEARDRIRRF FPKGDLEVLQ 300

AQVERIMTRK ELLTVYSSED GSEEFETIVL KALVKACGSS EASAYLDELR LAVAWNRVDI 360 

AQSELFRGDI QWRSFHLEAS LMDALLNDRP EFVRLLISHG LSLGHFLTPM RLAQLYSAAP 420 

SNSLIRNLLD QASHSAGTKA PALKGGAAEL RPPDVGHVLR MLLGKMCAPR YPSGGAWDPH 480

PGQGFGESMY LLSDKATSPL SLDAGLGQAP WSDLLLWALL LNRAQMAMYF WEMGSNAVSS 540 

ALGACLLLRV MARLEPDAEE AARRKDLAFK FEGMGVDLFG ECYRSSEVRA ARLLLRRCPL 600 

WGDATCLQLA MQADARAFFA QDGVQSLLTQ KWWGDMASTT PIWALVLAFF CPPLIYTRLI 660

TFRKSEEEPT REELEFDMDS VINGEGPVGT ADPAEKTPLG VPRQSGRPGC CGGRCGGRRC 720 

LRRWFHFWGA PVTIFMGNVV SYLLFLLLFS RVLLVDFQPA PPGSLELLLY FWAFTLLCEE 780

LRQGLSGGGG SLASGGPGPG HASLSQRLRL YLADSWNQCD LVALTCFLLG VGCRLTPGLY 840 

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFFLFFLG VWLVAYGVAT 900 

EGLLRPRDSD FPSILRRVFY RPYLQIFGQI PQEDMDVALM EHSNCSSEPG FWAHPPGAQA 960

GTCVSQYANW LVVLLLVIFL LVANILLVNL LIAMFSYTFG KVQGNSDLYW KAQRYRLIRE 1020

FHSRPALAPP FIVISHLRLL LRQLCRRPRS PQPSSPALEH FRVYLSKEAE RKLLTWESVH 1080

KENFLLARAR DKRESDSERL KRTSQKVDLA LKQLGHIREY EQRLKVLERE VQQCSRVLGW 1140
VAEALSRSAL LPPGGPPPPD LPGSKD 

SEQ ID NO:219 PBF1 DNA SEQUENCE

Nucleic Acid Accession #: AA054237

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGAGCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60 

CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA CCGACCCCCG GCGCCACAAG 120 

GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180 
CCGCTGTCGC ACCTGCCGCT GCGGGACTCG CCCCCGCTGG GGCGCCGGCT GCTCCCGGGC 240

GGCCCGGGGC GCGCCGACCC CGAGTCCTGG CGCTCGCTCC TGGGGCTCGG CGGGCTGGAC 300 

GCCGAGTGCG GCCGGCCCCT CTTCCCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCATCG ACCGGGACAT CGACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACG 420 

GCCATCAAGT ACCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480

AAGACCATAC AGCAAGATGA GTGGCACCTG CTTCATTTAA GAAGAATCAC TGCTGGCTTC 540 

CTCGGCATGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTGT TCCTCATGAC AGGGATATTT 660 

TGCACCATTT CCCTCTGTAC TTATGCCGCC AGTATCTCGT ATGATTTGAA CCGGCTCCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780

GCCTGGTGCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATCGC TTATCCGTTT 840 
ATTAGCCGGA CCAAGATTGC ACAGCTAAAG TCTGGCAGAG ACTCCACGGT ATGA 

SEQ ID NO:220 PBF1 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

MEPRALVTAL SLGLSLCSLG LLVTAIFTDH WYETDPRRHK ESCERSRAGA DPPDQKNRLM 60

PLSHLPLRDS PPLGRRLLPG GPGRADPESW RSLLGLGGLD AECGRPLFAT YSGLWRKCYF 120

LGIDRDIDTL ILKGIAQRCT AIKYHFSQPI RLRNIPFNLT KTIQQDEWHL LHLRRITAGF 180

LGMAVAVLLC GCIVATVSFF WEESLTQHVA GLLFLMTGIF CTISLCTYAA SISYDLNRLP 240 
KLIYSLPADV EHGYSWSIFC AWCSLGFIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV 






SEQ ID NO:221 PCI4 DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:222 PCI4 Protein sequence:

Protein Accession #: NP_057654 

1 11 21 31 41 51 

| | | | | |

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 

KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180

VAGKFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 

IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300

MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360
EDGHTDNHLP LLENNTH 

SEQ ID NO:223 PEZ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_001935.1

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CGCGCGTCTC CGCCGCCCGC GTGACTTCTG CCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120 

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATGATGCT 180 

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240 

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300
ATCTTGGTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAGAA CAGTACATTT 360

GATGAGTTTG GACATTCTAT CAATGATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480

GATTTAAATA AAAGGCAGCT GATTACAGAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TATGTTTGGA ACAATGACAT TTATGTTAAA 600

ATTGAACCAA ATTTACCAAG TTACAGAATC ACATGGACGG GGAAAGAAGA TATAATATAT 660

AATGGAATAA CTGACTGGGT TTATGAAGAG GAAGTCTTCA GTGCCTACTC TGCTCTGTGG 720

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780

ATTGAATACT CCTTCTACTC TGATGAGTCA CTGCAGTACC CAAAGACTGT ACGGGTTCCA 840

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900

CTCAGCTCAG TCACCAATGC AACTTCCATA CAAATCACTG CTCCTGCTTC TATGTTGATA 960

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGG 1020

CTCAGGAGGA TTCAGAACTA TTCGGTCATG GATATTTGTG ACTATGATGA ATCCAGTGGA 1080

AGATGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAATGA GTACTACTGG CTGGGTTGGA 1140

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200

AGCAATGAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260

TTTATTACAA AAGGCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320

TACTACATTA GTAATGAATA TAAAGGAATG CCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTGTCAG 1440

TACTATTCTG TGTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCCGGTCCT 1500

GGTCTGCCCC TCTATACTCT ACACAGCAGC GTGAATGATA AAGGGCTGAG AGTCCTGGAA 1560

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA TGCCCTCCAA AAAACTGGAC 1620

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTGCCTCC TCATTTTGAT 1680

AAATCCAAGA AATATCCTCT ACTATTAGAT GTGTATGCAG GCCCATGTAG TCAAAAAGCA 1740

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTGAAG CAGCCAGACA ATTTTCAAAA 1920

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980

ACCTCAATGG TCCTGGGATC GGGAAGTGGC GTGTTCAAGT GTGGAATAGC CGTGGCGCCT 2040

GTATCCCGGT GGGAGTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160

AAACAAGTTG AGTACCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220

TCAGCTCAGA TCTCCAAAGC CCTGGTCGAT GTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280

ACTGATGAAG ACCATGGAAT AGCTAGCAGC ACAGCACACC AACATATATA TACCCACATG 2340

AGCCACTTCA TAAAACAATG TTTCTCTTTA CCTTAGCACC TCAAAATACC ATGCCATTTA 2400

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGATGA 2460

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGGT TACCTTTGTT CCCAAATTTC 2520

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAGATTCC CTGAGCAGGA TTTTAATCTT 2700

TTTCTAACTG GACTGGTTCA AATGTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAG GGAGAGAAGA TAGCAGGGCA 2820

TGGCTGGGAA CCCAAGTCCA AGCATACCAA CACGAGCAGG CTACTGTCAG CTCCCCTCGG 2880

AGAAGAGCTG TTCACCACGA GACTGGCACA GTTTTCTGAG AAAGACTATT CAAACAGTCT 2940

CAGGAAATCA AATATCGAAA GCACTGACTT CTAAGTAAAC CACAGCAGTT GAAAGACTCC 3000

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCCCCAG GTGCCAGTTA TGGCTATAGG 3060

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120

TACTGATGTT CCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGCC 3180

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCCC ATTCTACATA 3360
CTATGGAATT TCTCCCAGTC ATTTAATAAA TGTGCCTTCA TTTTTTC



SEQ ID NO:224 PEZ3 Protein sequence:

Protein Accession #: NP_001926.1

1 11 21 31 41 51

| | | | | |

MKTPWKILLG LLGAAALVTI ITVPVVLLNK GTDDATADSR KTYTLTDYLK NTYRLKLYSL 60

RWISDHEYLY KQENNILVFN AEYGNSSVFL ENSTFDEFGH SINDYSISPD GQFILLEYNY 120

VKQWRHSYTA SYDIYDLNKR QLITEERIPN NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL 180

PSYRITWTGK EDIIYNGITD WVYEEEVFSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF 240

YSDESLQYPK TVRVPYPKAG AVNPTVKFFV VNTDSLSSVT NATSIQITAP ASMLIGDHYL 300

CDVTWATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST TGWVGRFRPS 360

EPHFTLDGNS FYKIISNEEG YRHICYFQID KKDCTFITKG TWEVIGIEAL TSDYLYYISN 420

EYKGMPGGRN LYKIQLIDYT KVTCLSCELN PERCQYYSVS FSKEAKYYQL RCSGPGLPLY 480

TLHSSVNDKG LRVLEDNSAL DKMLQNVQMP SKKLDFIILN ETKFWYQMIL PPHFDKSKKY 540

PLLLDVYAGP CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDKI MHAINRRLGT 600

FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG IAVAPVSRWE 660

YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY LLIHGTADDN VHFQQSAQIS 720
KALVDVGVDF QAMWYTDEDH GIASSTAHQH IYTHMSHFIK QCFSLP

SEQ ID NO:225 PBJ2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons)

ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC 60

AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC 120 

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT 180

GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGCCCACAC TGGATGTCTA A 



SEQ ID NO:226 PBJ2 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MALAKVREPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RNPQELWMGL LLLMGVLEAC 60
VEMRPLSVWS LRDDKEQSPH QPTLDV

SEQ ID NO:227 PBM2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGCCAAATG CTGAGTTAGA AGCAAAGAGC CTTGGAAGCA GTAAATGTTT AAAAACTGCT 60 

CTCATACTTG CTGTATGTTG TGGATCAGCA AATATAGTCA GCCCTCTACT TGAGCAAAAT 120 

ATTGATGTAT CTTCTCAAGA TCTGGACAGA CGGCCAGAGA GTATGCTGTT TCTAGTCATC 180 

ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTAGAAGAT 240 

TTTATGGCTA TTGAAGAAGA AATGAAGAAG CACGGAAGTA CTCATGTGGG ATTCCCAGAA 300 

AACCTGACTA ATGGTGCCGC TGCTGGCAAT GGTGATGATG GATTAATTCC TCCAAGGAAG 360 

AGCAGAACAC CTGAAAGCCA GCAATTTCCT GACACTGAGA ATGAAGAGTA TCACAGGTTT 420 
GTCAAAGATC AGATAGTTGT AGATATGCGG CGTTATTTCT GA

SEQ ID NO:228 PBM2 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MPNAELEAKS LGSSKCLKTA LILAVCCGSA NIVSPLLEQN IDVSSQDLDR RPESMLFLVI 60
IMWTSFVEDN LSMGWGKLED FMAIEEEMKK HGSTHVGFPE NLTNGAAAGN GDDGLIPPRK 120
SRTPESQQFP DTENEEYHRF VKDQIVVDMR RYF



SEQ ID NO:229 PEZ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_014253

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

| | | | | |

GACTGCTTGC ATTAAAGGAC TTCCTCATCC TTTTTTTCAT GAAACTGAGC TTGCTTAATC 60 

AGAGATGGAG CAAACTGACT GCAAACCCTA CCAGCCTCTA CCAAAAGTCA AGCATGAAAT 120

GGATCTAGCT TACACCAGTT CTTCTGATGA GAGTGAAGAT GGAAGAAAAC CAAGACAGTC 180 

ATACAACTCC AGGGAGACCC TGCACGAGTA TAACCAGGAG CTGAGGATGA ATTACAATAG 240 

CCAGAGTAGA AAGAGGAAAG AAGTAGAAAA ATCTACTCAA GAGATGGAAT TCTGTGAAAC 300 

CTCTCACACT CTGTGCTCTG GCTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420

TGCACTAAGA ATGTGGATAA GGGGAATGAA ATCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480 

GGCCAACTCT GCATTATCCT TGACTGACAC TGACCATGAA AGGAAGTCTG ATGGGGAAAA 540 

TGGTTTCAAA TTCTCTCCTG TTTGTTGTGA CATGGAGGCT CAAGCTGGGT CTACTCAAGA 600 

TGTGCAGAGC AGCCCACACA ACCAGTTCAC CTTCAGACCC CTCCCACCGC CACCTCCGCC 660

TCCTCATGCC TGCACCTGTG CCAGGAAGCC ACCCCCTGCA GCGGACTCTC TTCAGAGGAG 720

ATCAATGACT ACCCGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCCCAA CCAGCACGCA 780 

GGATTCAGTC CATCTGCATA ACAGCTGGGT CCTGAACAGC AACATACCAT TGGAGACCAG 840 

GCATTCCCTG TTCAAACATG GATCTGGTTC CTCTGCGATC TTCAGTGCAG CCAGTCAGAA 900 

CTACCCTCTG ACATCCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960

CTTTTCCCGA CCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080 

AGTGCATTTG TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140 

TGGAGTTAGC AAAGGGAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200 

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGACGGG CGATAGACAC 1260 

TGGAGAAGTT GACATTGGTG CACAGGTCAT GCAGACCATT CCACCTGGTT TATTCTGGCG 1320

TTTCCAGATT ACTATCCACC ATCCAATATA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380 

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440

TGTAAAACTA ATGGATGGCA AACAGCTGGT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500 

ACAGCACTCC CCTCGGAACC TGATCTTAAC TTCGCTTCAG GAGACAGGTT TCATAGAGTA 1560 

TATGGATCAA GGACCTTGGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTGT TCAACCAATT GCAATGGAAA 1680 

TGGAGAGTGT ATCTCTGGCC ATTGTCATTG TTTCCCAGGA TTCCTTGGAC CTGACTGTGC 1740 

TAGAGATTCC TCCCCTGTGC TGTGTGGTGG GAATGGAGAA TACGAGAAAG GACACTGTGT 1800 

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCCG GAAGAACAAT GCATTGATCC 1860 

AACATGCTTT GGCCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920

AGGAGAAATA TGCGAGGAAG AGGACTGCCT AGACCCAATG TGTTCCAACC ATGGCATCTG 1980

TGTAAAAGGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTGTG AAACACCACT 2040 

TCCTGTATGT CAAGAGCAGT GCTCAGGACA CGGAACTTTT CTTCTGGACG CTGGAGTATC 2100

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAACA GAGCTGTGTA CCATGGAGTG 2160 

TGGTAGCCAT GGAGTCTGCT CAAGAGGAAT TTGCCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAACGCTCCT GTCATTCTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAG TGTAGCCCTG GATGGGAGGG CGACCACTGC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGCTGCC CAGGGCTCTG CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400 

AAATGGTTGG CACTGTGTGT GTCAGGTGGG TTGGAGTGGG ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTCGACAA TGATGGAGAT GGTTTAACCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580 

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTCCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTGTGAT TCGAGGCCAA GTGGTGGCCA TAGATGGAAC 2760 

TCCTCTAGTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTGGC CATCGGTGGC ATCTCTGTCA TCTTAATCTT 2880 

CGACCGATCC CCTTTCCTGC CTGAGAAGAG AACACTCTGG TTGCCTTGGA ATCAGTTTAT 2940 

TGTGGTAGAG AAAGTCACCA TGCAGAGAGT TGTATCAGAC CCGCCATCCT GCGATATCTC 3000 

CAACTTTATC AGCCCAAACC CTATTGTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAGAG AGGGGAACTA TTGTTCCTGA GCTGCAGGTT GTACAGGAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCGC ACCCCTGGGT ATAAAACCCT 3180 

GCTACGGATC CTTCTGACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTCCC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT TGGAACAAGA CCGATATCTA TGGACAGAAG GTTTGGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGATATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420 

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGG AATCATACAT AAAGGGAATG GAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCCC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TGTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCCCTGATG GCAGTGTGTA TGTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTCCCTCG GGAAACTCCG TTAGTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGACCCTG TGTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCATCGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTTTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATCGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCATT 4200 

GTATGTCTTG GATAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCCCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTCCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTCAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

CGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATGTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGCGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TCGCCGTGAT GCAGGCGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GAGTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAATCCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTP TATGAGTATG ACCCCGAGGG 5040 

ACACCTGACC AATGCAACGT TTCCCACTGG AGAGGTCAGC AGCTTCCACA GTGACCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATTTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220 

GGTGAATCCA GATGGTTCCC TGCGTGTCAC TTTTGCCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAACCCTACC CTGGGCAAAT GCAACATCTC 5340

ATTGCCCGGA GAGCACAATG CAAACCTCAT CGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA CTGGGCGACC CATTCTGTGG TCTCCTGTAA GCAGATATAA 5580 

TGAAGTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTPATTCAAA GAGGAACGTG 5640 

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTATT TCAAGAACTT GGGCTGATGG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATG CTTCTCCTAC ACAGCCAGCG 5760 

GCGTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTGCGCCAC AGCTTACAAA CCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAGAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTTGTGA ATGCACGGTT 6180 

CGACTACAGC TACAACAATT TCCGAGTCAC AAGCATGCAA GCTGTAATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GGTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGCCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCATAAG 6480 

GGTAGGAGTA GATGCCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATGGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACCCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TCGTCTTACT CCTCTCCGAT ATGACCTCCG 6660 

AGACCGCATC ACCAGATTAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720 
GCAGAGGGGA AATGATATTT TTGAATATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780
TAAGGCTTCT GGCTGGACTG TGCAGTATTA CTATGATGGG CTTGGGCGAC GTGTCGCGAG 6840 
TAAGTCCAGC CTAGGGCAGC ACCTTCAGTT CTTTGTCGAC GCGACCGCGA ACCCCATAAG 6900 
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGATCTCCA 6960 
AGGTCACCTT ATTGCCATGG AGTTAAGCAG TGGTGAAGAA TATTATGTAG CCTGTGATAA 7020
TACAGGTACC CCACTAGCTG TGTTCAGCAG CCGAGGTCAG GTCATAAAGG AGATACTATA 7080 
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140 
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CTGGGGCAAA GGGATTATGA 7200 
TGTTGTTGCT GGCAGATGGA CAACGGCCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260 
TCCTAAACCA TTCAACCTCT ACTCCTTTGA AAATAACTAC CCAGTTGGCA AAATTCAAGA 7320
TGTTGCAAAG TATACCACAG ACATCAGAAG TTGGTTGGAG CTATTTGGTT TCCAATTACA 7380 
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440 
TCTACGGCTT CAGACAAAAA CTCAAGAGTG GGATCCTGGA AAGACTATCC TGGGCATTCA 7500 
GTGTGAACTC CAGAAACAGC TCAGGAATTT CATTTCCTTG GACCAACTAC CTATGACTCC 7560 
CCGATACAAT GATGGACGGT GCCTTGAAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620
TTCTGTTTTT GGGAAAGGTA TAAAATTTGC CATCAAGGAT GGCATAGTAA CAGCTGATAT 7680 
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTGCTGCC ATTCTCAATA ATGCCCATTA 7740 
CCTGGAAAAC CTACATTTTA CCATAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800 
GTCTCTGGAG GAAGACCTGG TGCTCATCGG TAACACTGGG GGGAGGCGGA TTCTGGAGAA 7860 
TGGTGTCAAT GTCACTGTGT CCCAGATGAC TTCTCTGTTG AATGGGAGGA CTAGACGGTT 7920
TGCAGATATT CAGCTCCAGC ATGGAGCCCT GTGCTTCAAC ATCCGGTATG GGACAACTGT 7980 
CGAAGAGGAA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040 
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGG ATTAGGGCAT GGACAGAAGG 8100 
GGAAAAGCAG CAGCTTTTGA GCACTGGGCG GGTACAAGGT TACGATGGGT ATTTTGTTTT 8160 
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAGTGCCAAT AATATTCACT TTATGAGACA 8220
GAGCGAAATA GGCAGGAGGT AACAAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGCCT 8280
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGTATATGT 8340 
AAATATCGAG GAAAAACATA TCCAACTGCC TTTCAATGTG ACGGAAGATG GTATTTTAAT 8400 
ATTGTTTCTT TAAACTCTTT AAGAAATGAC AGAGATTTTT AGTTCTTGTG TGGCAGTATT 8460 
CAAAATAACA CAAGTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAAGC ACCACTTTCA 8520
ATTTGCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580 
TTGTGAGAAG CAGTTTCATC CTTAACTGTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640 
GTGCAATAGT ATCTGAAACT TGCCTTTCGA AAGACTGCCA GCCCTTTGAC GTTTTCCAGA 8700 
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGAGTG 8760 
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTCCTTGC TTGTTAAAGT AAATGCCATA 8820 
TTGTTGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTACCATGCT TCCCTGTGGG 8880 
TGTGGTAACC AGACTGTATA GCCGCTATTT GCTCGTGTGT ACATGATACC AAAGCAGCTG 8940 
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000 
GCTGTATTGG TATCATGTAA ACATAGCTTT TATTAACCTG GGTAGGAATT TCTCATTTAT 9060 
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAGTATCTAT ACACTGACCC 9120 
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TGATTCTGGT ACAAAAACAA 9180 
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGCCGATA CATACATTAA CTACTACTGC 9240 
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AATAGACCTG GTACTTAAGT GAAAGTACTA 9300 
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTGCCAC ATTGTTTCAG CCCACTTAGA 9360 
TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATCGA 9420 
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480 
TTTACTAAAA TAATTTATAC AGTTAGTTAT TTTCGTTCTC CGTACTTACC CATTTATCTT 9540 
TATTTAATCG TCTCTACTGC CTAGGAAAAT AACTATTTTC CAGGACGGGT TATTTGTTCT 9600 
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660 
AATCTCTAGG AATCCTGCAG TAAAACAAGC CCCTTGGTGA GCTGGAAGAT TTGTGCCCAG 9720 
TGACAAAGAG ATAGTTTGTA AAATGCTCTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780 
TGACAGCACA ATGTGGCCCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840 
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900 
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATTGCCTT GCTTTGCGAT GACAGTTTCT 9960 
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020
CACAAAGGAA AGCAAGGGAA AGGAAATGAC CCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080 
AACATTTCAT TTTCAAAACC TTCGGGTTAG AATACCACTT ACACATGTAT TCTGAGAGAC 10140 
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200 
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTGAAT CAGGCCTGTA TTAATGGTAC 10260
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380 
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440 
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTG CTTATATTTT AAGCAATTAT 10500 
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560 
ACAAATATTT GAAGCTTTTA CTTAATAGTG ATTACCTTGA ACTGTGCATT TCTAGTTTGT 10620
AATACGTATT TGGTTGGTTC GTCCCTTTAG TTTGTTAAAG TTACATTTGT ATTATATTCA 10680 
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740 
TCTTTTTACA ACTCCTAAAG CTTGAGGGAG GAAAGAAAAA AAAAACAAAA CTACTAATCA 10800 
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860 
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980 
GAAACATATG AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040 
ATTCGAGTAA GTTAAAGTGA GAGCATAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100 
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGTTTGT 11160 
GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280 
TTGATTAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340 
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400 
AAAATAAGTG TGTCCTTTAC TGTCAATTTA TCGAGAAGAT CTATAATATA TAGACTACAT 11460 
ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520
TGACATGCTA CCTATGTAGA CAGGTATGAA ATTAAGTTAT AATTTTCATG AGACATTTTC 11580
ATCACTGTTG ACACAGTTTC AAGGCATTCC ATCATGTTAT TTTGACTCTT TTTCTTTTTT 11640 
TTTTCTTTAA AAATATATTT TTAACTAGAC CAGGCCCCAC TATAATATCA CTTAAGAGAG 11700 
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760 
TCATTGGTAA TAGAAGCAAA AGTACAGTAA CGAAGTATTG AAAAGAAAAT TTTGGAGACA 11820 
TTGGAGCATA TTATATATAG CTTGTGGAAA GACATAAGGC TACAGATGGA ATGGAACATT 11880 
CCTGTTTTCT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CACAGCCTTC TATAAAGGTT CTTTCTTCTG CAAAGAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAAGCG CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060 
AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120 
TGTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAATATACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG AGTTAAACTT GCTGTGGATT TTGTCTTGGC 12240
AGTTGTCATC TTACATTATT TGTCAAAGGA AATGTGTTTG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GTGGACTTTA ACCTCTTAAA TAAATGTTAG TATATCAGAT TGTGTCCTTG 12360 
AAAAATATTT TACTTGTATG AATCATGACA ACGTCTAAAT CTTTACTATT CTTCTGGCAA 12420 
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAAACATT 12480 
CTTTTTAGCT GCTTACTTTC TCATGAAAAG TAAAGATGTT TACAGTGTAT GCCAAGTTTT 12540 
CAGTTTCTGT ATAACAACAG GTAGAGGTTC TAATCATATT GAAAATTGTG TTATAATGGT 12600 
CTGAGCCATG TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTG 12660 
AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTTAAAT GGTCTTTTGC 12720 
ATTTTGCTCT GGCTTTCAGG CCAGAAGCAT GCATTTTTCT ACAAGAGCAT CACAACAACA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATG TGTTGATATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TTAAAAAATA AAATTGCCAA TGAAAAAAAA 
I I I I I I

MEQTDCKPYQ PLPKVKHEMD LAYTSSSDES EDGRKPRQSY NSRETLHEYN QELRMNYNSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEMGSDVDTE TEGAASPDHA 120 

LRHWIRGMKS EHSSCLSSRA NSALSLTDTD HERKSDGENG FKFSFVCCDM EAQAGSTQDV 180 

QSSPHNQFTP RPLPPPPPPP HACTCARKPP PAADSLQRRS MTTRSQPSPA APAPPTSTQD 240 

SVHLHNSWVL NSNIPLETRH SLFKHGSGSS AIFSAASQNY PLTSNTVYSP PPRPLPRSTP 300 

SRPAFTFNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLFGLTWQLQ PVEGELYANG 360 

VSKGNRGTES MDTTYSPIGG KVSDKSEKKV FQKGRAIDTG EVDIGAQVMQ TIPPGLFWRF 420

QITIHHPIYL KFNISLAKDS LLGIYGRRNI PPTHTQPDFV KLMDGKQLVK QDSKGSDDTQ 480

HSPRNLILTS LQETGFIEJYM DQGPWYLAFY NDGKKMEQVF VLTTAIEIMD DCSTNCNGNG 540 

ECISGHCHCP PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGWKGPECD VPEEQCIDPT 600 

CFGHGTCIMG VCICVPGYKG EICEEEDCLD PHCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQEQCSGHG TFLJ^DAGVCS CDPKWTGSDC STELCTMECG SHGVCSRGIC QCEEGWVGPT 720 

CEERSCHSHC TEHGQCKDGK CECSPGWEGD HCTIAHYLDA VRDGCPGLCF GNGRCTLDQN 780 

GWHCVCQVGW SGTGCNVVME MLCGDNLDND GDGLTDCVDP DCCQQSNCYI SPLCQGSPDP 840

LDLIQQSQTL FSQHTSRLFY DRIKFLIGKD STHVIPPEVS FDSRRACVIR GQVVAIDGTP 900

LVGVNVSFLH HSDYGFTISR QDGSFDLVAI GGISVILIFD RSPFLPEKRT LWLPWNQFIV 960 

VEKVTMQRVV SDPPSCDISN FISPNPIVLP SPLTSFGGSC PERGTIVPEL QVVQEEIPIP 1020

SSFVRLSYLS SRTPGYKTLL RILLTHSTIP VGMIKVHLTV AVEGRLTQKW FPAAINLVYT 1080 

FAWNKTDIYG QKVWGLAEAL VSVGYEYETC PDFILWEQRT VVLQGFEMDA SNLGDWSLNK 1140

HHILNPQSGI IHKGKGENMP ISQQPPVIST IMGNGHQRSV ACTNCKGPAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIF PSGNSVSILE LSTSPAHKYY LAMDFVSESL YLSDTNTRKV 1260 

YKLKSLVETK DLSKNFEVVA GTGDQCLPFD QSHCGDGGRA SEASLNSPRG ITVDRHGFIY 1320

FVDGTMIRKI DENAVITTVI GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVMPMDNSLY 1380 

VLDNNIVLQI SENRRVRIIA GRPIHCQVPG IDHFLVSKVA IHSTLESARA ISVSHSGLLF 1440 

IAETDERKVN RIQQVTTNGE IYIIAGAPTD CDCKIDPNCD CFSGDGGYAK DAKMKAPSSL 1500

AVSPDGTLYV ADLGNVRIRT ISRNQAHLND MNIYEIASPA DQELYQFTVN GTHLHTLNLI 1560 

TRDYVYNFTY NSEGDLGAIT SSNGNSVHIR RDAGGMPLWL VVPGGQVYWL TISSNGVLKR 1620

VSAQGYNPAL MTYPGNTGLL ATKSNENGWT TVYEYDPEGH LTNATFPTGE VSSFHSDLEK 1680 

LTKVELDTSN RENVLMSTNL TATSTIYILK QENTQSTYRV NPDGSLRVTP ASGMEIGLSS 1740 

EPHILAGAVN PTLGKCNISL PGEHNANLIE WRQRKEQNKG NVSAFERRLR AHNRNLLSID 1800 

FDHITRTGKI YDDHRKFTLR ILYDQTGRPI LWSPVSRYNE VNITYSPSGL VTFIQRGTWN 1860 

EKMEYDQSGK IISRTWADGK IWSYTYLEKS VMLLLHSQRR YIFEYDQSDC LLSVTMPSMV 1920

RHSLQTMLSV GYYRNIYTPP DSSTSFIQDY SRDGRLLQTL HLGTGRRVLY KYTKQARLSE 1980

VLYDTTQVTL TYEESSGVIK TIHLMHDGFI CTIRYRQTGP LIGRQIFRFS EEGLVNARFD 2040

YSYNNFRVTS MQAVIKETFL PIDLYRYVDV SGRTEQFGKF SVINYDLNQV ITTTVHKHTK 2100 

IPSANGQVIE VQYEILKAIA YWMTIQYDNV GRHGNMCIRV GVDANITRYF YEYDADGQLQ 2160 

TVSVNDKTQW RYSYDLNGDI NLLSHGKSAR LTPLRYDLRD RITRLGEIQY KMDEDGFLRQ 2220 

RGNDIFEYNS NGLLQKAYNK ASGWTVQYYY DGLGRRVASK SSLGQHLQFF VDATANPIRV 2280

THLYNHTSSE ITSLYYDLQG HLIAMELSSG EEYYVACDNT GTPLAVFSSR GQVIKEILYT 2340 

PYGDIYHDTY PDFQVIIGFH GGLYDFLTKL VHLGQRDYDV VAGRWTTAYH HIWKQLNLLP 2400 

KPFNLYSFEN NYPVGKIQDV AKYTTDIRSW LELFGFQLHN VLPGFPKPEL ENLELTYELL 2460

RLQTKTQEWD PGKTILGIQC ELQKQLRNFI SLDQLPMTPR YNDGRCLEGG KQPRFAAVPS 2520 

VFGKGIKFAI KDGIVTADII GVANEDSRRL AAILNNAHYL ENLHFTIEGR DTHYFIKLGS 2580 

LEEDLVLIGN TGGRRILEKG VNVTVSQMTS LLNGRTRRFA DIQLQHGALC FNIRYGTTVE 2640

EEKNHVLEIA RQRAVAQAWT KEQRRLQEGE EGIRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700 
VEQYLELSDS ANNIHFHRQS EIGRR 

SEQ ID NO:231 PFD4 DNA SEQUENCE:
Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CTCAGCCTTC CCGGTTCGGG AAAGGGGAAG AATGCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC CGCAGCCTGT GTGTGTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTGCTCCGTA AATAAAACGT CCCACTGCCT TCTGAGAGCG CTATAAAGGC AGCGGAAGGG 180 

TAGTCCGCGG GGCATTCCGG GCGGGGCGCG AGCAGAGACA GGTCATGGCA GCGCCAGGCG 240 

GCAGGTCGGA GCCGCCGCAG CTCCCCGAGT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300 

TCTACAGCGA GCTCGCTTTC CAGCAACAGC ACGAGCGGCG CCTGCAGGAG CGCAAGACGC 360 

TGCGGGAGAG CCTGGCCAAG TGCTGCAGTT GTTCAAGAAA GAGAGCCTTT GGTGTGCTAA 420 

AGACTCTTGT GCCCATCTTG GAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATCCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACATA TCTCAGTTGG ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG CCCCCGACGA ACACTTTCTC GTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT ACTACTATGA TAGACACTGC AGCTAGAGAT ACAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTG ACTCTGCTGG TTGGAATTAT ACAGTTGATA TTTGGTGGCT 840 

TGCAGATTGG ATTCATAGTG AGGTACTTGG CAGATCCTTT GGTTGGTGGC TTCACAACAG 900 

CTGCTGCCTT CCAAGTGCTG GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAACCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CGCTGGTTGA GATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCACCAT TGTCGTCTGT ATGGCAGTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCCCAGTCCC TATTCCTATA GAAGTAATTG 1140 

TGACGATAAT TGCTACTGCC ATTTCATATG GAGCCAACCT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGGTTTT TGCCTCCTGA ACTTCCACCT GTGAGCTTGT 1260 

TCTCGGAGAT GCTGGCTGCA TCATTTTCCA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT CGATGGGAAC CAGGAATTCA 1380 

TTGCCTTTGG GATCAGCAAC ATCTTCTCAG GATTCTTCTC TTGTTTTGTG GCCACCACTG 1440 

CTCTTTCCCG CACGGCCGTC CAGGAGAGCA CTGGAGGAAA GACACAGGTT GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGCCCTGGG GAAGCTTCTG GAACCCTTGC 1560 

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGGATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTATAGTGTC CATCATTCTG GGGCTGGATC TCGGTTTACT AGCTGGCCTT ATATTTGGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTTGGAA TGGCCTTGGA AGCATCCCTA 1800 

GCACAGATAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAATA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AATGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAG 2160 

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC ATAGCCTTGT GCTTGACTGT GGAGCTATAT 2220 

CTTTCCTGGA CGTTGTTGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGACAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGATTGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2520 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAG TGGGTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTCCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGCCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TCTAGGCTTT ATTTATAAAA TAAACACCTT ATCCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCG TCCAGGGTAA GCTGGTGTTA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCTAATAATG TTCACGTGGG CCCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAGC GAGTCACGAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTG ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATTTATCAAT AAAAGCCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAATA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAATAATC ATGATCACAA CTGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTTAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420 

GTGCTATTCT GAGTGAAAAT TTTTTTGATG TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTTATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG AGAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTTCTT TTTATTTATT 3720 

TTTTTCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCTT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTG AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATCCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCCC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

CCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080 

CCTGAATTCT GGTTGTAAAT CTGGTTACAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140 

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200 

AGGCGCGGTG GCTCATGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTAAA ACCCCGTCTC TACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380

GGAGGCTAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGGT 4440 

CGTGCCACTG CACTCCAGCC TGGGCGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500 
AAAAAAAAAA AGAGTGAATG TAATAGTCTT GCAGAAAATG AATGAATACC TTTGTTCAAT 4560

AAAGGAAATA TGCACTGCTC ACTTTTTTGA AGGAAATGCC AAAGTTACGT TTTACAACAA 4620 

GGCTAGAGTT TGTAAATTCT GGGTTCATTT GTGATGACAT AAGTCAGCAA ACTGCGGGAA 4680 

TACTGTCTCT TCTATGTATT TTGTGAATAG TAAGCATAAT TTTAGTTTTG TATTATCAAT 4740 

GAAAATTTCA CTTGAAATTA AAGCTGCCTT TTGTTATATT TTTAACCTAT AGGATAAGAT 4800 

TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA ATCATGTACA TTTGAAAATA 4860 

TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA ACTACTCGGA TGTGTCCTTT 4920 
CTGAACAAAA 



SEQ ID NO:232 PFD4 Protein sequence:

Protein Accession #: O43511

1 11 21 31 41 51 

I I I I I I 

MAAPGGRSEP PQLPEYSCSY MVSRPVYSEL AFQQQHERRL QERKTLRESL AKCCSCSRKR 60

AFGVLKTLVP ILEWLPKYRV KEWLLSDVIS GVSTGLVATL QGMAYALLAA VPVGYGLYSA 120 

FFPILTYFIF GTSRHISVGP FPVVSLMVGS VVLSMAPDEH FLVSSSNGTV LNTTMIDTAA 180

RDTARVLIAS ALTLLVGIIQ LIFGGLQIGF IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240 

NVSTKNYNGV LSIIYTLVEI FQNIGDTNLA DFTAGLLTIV VCMAVKELND RFRHKIPVPI 300

PIEVIVTIIA TAISYGANLE KNYNAGIVKS IPRGFLPPEL PPVSLFSEML AASFSIAVVA 360

YAIAVSVGKV YATKYDYTID GNQEFIAFGI SNIFSGFFSC FVATTALSRT AVQESTGGKT 420 

QVAGIISAAI VMIAILALGK LLEPLQKSVL AAVVIANLKG MFMQLCDIPR LWRQNKIDAV 480

IWVFTCIVSI ILGLDLGLLA GLIFGLLTVV LRVQFPSWNG LGSIPSTDIY KSTKNYKNIE 540

EPQGVKILRF SSPIFYGNVD GFKKCIKSTV GFDAIRVYNK RLKALRKIQK LIKSGQLRAT 600

KNGIISDAVS TNNAFEPDED IEDLEELDIP TKEIEIQVDW NSELPVKVNV PKVPIHSLVL 660

DCGAISFLDV VGVRSLRVIV KEFQRIDVNV YFASLQDYVI EKLEQCGFFD DNIRKDTFFL 720 

TVHDAILYLQ NQVKSQEGQG SILETITLIQ DCKDTLELIE TELTEEELDV QDEAMRTLAS 780

SEQ ID NO:233 PFH2 DNA SEQUENCE:

Nucleic Acid Accession #: NM_018029

Coding sequence: 228-1097 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 

GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 

TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 

TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240

TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300

TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 

TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 

CTGGTTCCCA TGAAGCGCCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 

TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540

ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600

CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660 

TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720 

ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 

GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 

GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900

TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 

CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 

AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 

AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140

AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AGATTGCCAT GAATCTTGCA AA 

SEQ ID NO:234 PFH2 Protein sequence:

Protein Accession #: NP_057113

1 11 21 31 41 51

| | | | | |

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60

GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180

KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD

SEQ ID NO:235 ACC5 DNA SEQUENCE

Nucleic Acid Accession #: NM_000450

Coding sequence: 1-1833 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60 

GCCTGGTCTT ACAACACCTC CACGGAAGCT ATGACTTATG ATGAGGCCAG TGCTTATTGT 120 

CAGCAAAGGT ACACACACCT GCTTGCAATT CAAAACAAAG AAGAGATTGA GTACCTAAAC 180 

TCCATATTGA GCTATTCACC AAGTTATTAC TGGATTGGAA TCAGAAAAGT CAACAATGTG 240 

TGGGTCTGGG TAGGAACCCA GAAACCTCTG ACAGAAGAAG CCAAGAACTG GGCTCCAGGT 300 

GAACCCAACA ATAGGCAAAA AGATGAGGAC TGCGTGGAGA TCTACATCAA GAGAGAAAAA 360 

GATGTGGGCA TCTGGAATGA TGAGAGGTGC AGCAAGAAGA AGCTTGCCCT ATGCTACACA 420 

GCTGCCTGTA CCAATACATC CTGCAGTGGC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480 

TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT 540 

ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGGTTTGCA GTCACCCACT GGGAAACTTC 600 

AGCTACAATT CTTCCTGCTC TATCAGCTGT GATAGGGGTT ACCTGCCAAG CAGCATGGAG 660 

ACCATGCAGT GTATGTCCTC TGGAGAATGG AGTGCTCCTA TTCCAGCCTG CAATGTGGTT 720 

GAGTGTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATGTTTCCA AAACCCTGGA 780 

AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAG AAGGATTTGA ACTAATGGGA 840 

GCCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTGGGACA ACGAGAAGCC AACGTGTAAA 900 

GCTGTGACAT GCAGGGCCGT CCGCCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960 

CCTGCTGGAG AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020 

TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACTCAAGGGC AGTGGACACA GCAAATCCCA 1080 

GTTTGTGAAG CTTTCCAGTG CACAGCCTTG TCCAACCCCG AGCGAGGCTA CATGAATTGT 1140 

CTTCCTAGTG CTTCTGGCAG TTTCCGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200 

GGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTCGGACAAC 1260 

GAGAAGCCCA CATGTGAAGC TGTGAGATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320 

GTGAGGTGTG CTCATTCCCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380 

TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440 

TGGACAGAAG AGGTTCCTTC CTGCCAAGTG GTAAAATGTT CAAGCCTGGC AGTTCCGGGA 1500 

AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGCCTGT 1560 

CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCGGACAT GTGGAGCCAC AGGACACTGG 1620 

TCTGGCCTGC TACCTACCTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680 

CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740 

TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800 
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA 
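
The coding-sequence ranges quoted in these records (for example 1-1833 above) are 1-based and inclusive, and the sequence lines are printed as blocks of ten bases followed by a running position count. The following is a minimal sketch, not part of the original listing, of how such a record could be parsed and its stated range checked for a start codon, a stop codon and a whole number of codons. The function names and the toy record are hypothetical and chosen only for illustration; the codon table is the standard genetic code.

```python
import re

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_listing(text: str) -> str:
    """Join the base blocks of a listing, skipping position counts and rulers."""
    blocks = [tok.upper() for tok in text.split()
              if re.fullmatch(r"[ACGTacgt]+", tok)]
    return "".join(blocks)


def coding_region(seq: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive span start..end of seq."""
    return seq[start - 1:end]


def looks_like_cds(cds: str) -> bool:
    """True if cds starts with ATG, ends with a stop codon, and is a multiple of 3."""
    return cds.startswith("ATG") and len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS


def translate(cds: str) -> str:
    """Translate codon by codon; '*' marks a stop, 'X' an unreadable codon."""
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X")
                   for i in range(0, len(cds) - 2, 3))


# Hypothetical two-block record with "Coding sequence: 5-16".
toy_record = "GGCCATGGCA TTTTAAGGCC 20"
toy_seq = parse_listing(toy_record)
toy_cds = coding_region(toy_seq, 5, 16)
```

Applied to a full record, a translation that disagrees with the printed protein at a given position points to the residue or base that should be checked against the cited accession.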



SEQ ID NO:236 ACC5 Protein sequence:
Protein Accession #: NP_000441



1 11 21 31 41 51 

I I I I I I 

MIASQFLSAL TLVLLIKESG AWSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60 

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK 120 

DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCEQIVNC 180 

TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV 240

ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK 300

AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP 360

VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420

EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ 480 

WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540 

SGLLPTCEAP TESMIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600 
GSYQKPSYIL 

SEQ ID NO:237 PM28 DNA SEQUENCE

Nucleic Acid Accession #: NS1002

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACCC CAATGAGCCA AAGGGGGTCC 60 

CAAAGCAGTG GCTCGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120 

GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180 

CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240 

CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGATCCA 300 

CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360 

GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420 

TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480 

GCCCAGTCTC CCTCAGGAGT ATCCAGTGAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540 

TTTGAGCACC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600 

AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCGT 660 

GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720 

CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780 

ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840 

AACTATGAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTTCTTCCCG AGTGGGAGAG 900 

GTGGAACAGG AAGCAGAGAC AGCAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960 

AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020 

ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080 
AATGATAAAC TAGAAAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAGAG 1140

AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200 

AGAAAGGCTG AAACCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260 

ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAGAACGTA TGAGACATTT AGAGGGTCAA 1320 

CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380 

CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTGA CTGAATCCAA TGAACGCCTA 1440 

CAACTACACT TAAAGGAAAG AATGGCTGCT CTAGAAGAAA AGAATGTTTT AATTCAAGAA 1500 

TCAGAAACTT TCAGAAAGAA TCTTGAAGAA TCTTTACATG ATAAGGAAAG ATTAGCAGAA 1560 

GAAATTGAAA AGCTGAGATC TGAACTTGAC CAATTGAAAA TGAGAACTGG CTCTTTAATT 1620 

GAACCCACAA TACCAAGAAC TCATCTAGAC ACCTCAGCTG AGTTGCGGTA CTCAGTGGGA 1680 

TCCCTAGTGG ACAGCCAGTC TGATTACAGA ACAACTAAAG TAATAAGAAG ACCAAGGAGA 1740 

GGCCGCATGG GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800

AATAGAACTC AACAGATTGG AGTACTAAGC AGCCACCCTT TTGAAAGTGA CACTGAAATG 1860 

TCTGATATTG ATGATGATGA CAGAGAAACA ATTTTTAGCT CAATGGATCT TCTCTCTCCA 1920 

AGTGGTCATT CCGATGCCCA GACGCTAGCC ATGATGCTTC AGGAACAATT GGATGCCATC 1980 

AACAAAGAAA TCAGGCTAAT TCAGGAAGAA AAAGAATCTA CAGAGTTGCG TGCTGAAGAA 2040 

ATTGAAAATA GAGTGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100 

GGTACCTCCA TTACTGCCTC TGTTACAGCT TCATCGCTGG CCAGTTCATC TCCCCCCAGT 2160 

GGACACTCAA CTCCAAAGCT CACCCCTCGA AGCCCTGCCA GGGAAATGGA TCGGATGGGA 2220 

GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAGT TGTGGAAGAA 2280 

GATGGTCGAG AGGACAAAGC AACAATTAAA TGTGAAACTT CTCCTCCTCC TACCCCTAGA 2340 

GCCCTCAGAA TGACTCACAC TCTCCCTTCT TCCTACCACA ATGATGCTCG AAGTAGTTTA 2400 

TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460 

CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2520 

AAAGAAAAAG CTCGACTTGG GCAGCTCCGA GGCTTTATGG AGACTGAAGC TGCAGCTCAG 2580 

GAGTCCCTGG GGTTAGGCAA ACTCGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640 

AAGCATGAAC TTCTTGAAGA AGCTCGGAGA AAGGGATTAC CTTTTGCCCA GTGGGATGGG 2700 

CCAACTGTGG TCGCATGGCT AGAGCTTTGG TTGGGAATGC CTGCGTGGTA CGTGGCAGCC 2760 

TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820 

AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880 

ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACATCTC GAACTCCTTC AGGCAACGTT 2940 

TGGGTGACTC ATGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000 

GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGACCC TGGCTTATGG AGATATGAAT 3060 

CATGAGTGGA TTGGAAATGA ATGGCTTCCC AGCTTGGGGT TACCTCAGTA CAGAAGTTAC 3120 

TTTATGGAAT GCTTGGTAGA TGCAAGAATG TTAGATCACC TAACAAAAAA AGATCTCCGT 3180 

GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTATGTGC 3240 

TTAAAGAGGT TGAATTATGA CAGAAAAGAA CTAGAAAGAA GACGGGAAGC AAGCCAACAT 3300 

GAAATAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360 

GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420 

CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480 

ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540 

AGGCGACTGG ATGAAAGTGA TGACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600 

TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660 

GCTGGATTTA GGTTAACCAC AACCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720 

TCATCAAGAC TGCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATG TCTCGAGTAA 3780 
GCGGCCGCTT TAA 



SEQ ID NO:238 PM28 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I 

HHCEVMPTIN EDTPMSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60 

QRLQDVIYDR DSLQRQLNSA LPQDIESLTG GLAGSKGADP PEFAALTKEL NACREQLLEK 120 

EEEISELKAE RNNTRLLLEH LECLVSRHER SLRWIWKRC; AQSPSGVSSE VEVLKALKSL 180 

FEHHKALDEK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKH ASSEGSTESE 240 

HLEGMEPGQK VHEKRLSNGS IDSTDETSQI VELQELLEKQ NYEMAQMKER LAALSSRVGE 300 

VEQEAETARK DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRYLSA QRESTSIHDM 360 

NDKLENELAN KEAILRQMEE KNRQLQERLE LAEQKLQQTM RKAETLPEVE AELAQRZAAB 420 

TKAEERHGNI EERMRHLEGQ LEEKNQELQR ARQREKMNEB HNKRLSDTVD RLLTESNERL 480 

QLHLKERMAA LEEKNVLIQE SETFRKNLEE SLHOKERLAE EIEKLRSELD QLKHRTGSLI 540 

EPTIPRTHLD TSAELRYSVG SLVDSQSDYR TTKVIRRPRR GRKGVRRDEP KVKSLGDHEW 600

NRTQQIGVLS SHPPESDTEM SDIDDDDRET IFSSMDLLSP SGHSDAQTLA MMLQEQLDAI 660 

NKEIRLIQEE KESTELRAEE IENRVASVSL EGLNLARVHP GTSITASVTA SSLASSSPPS 720 

GHSTPKLTPR SPAREMDRMG VMTLPSDLRK HRRKIAVVEE DGREDKATIK CETSPPPTPR 780

ALRMTHTLPS SYHNDARSSL SVSLEPESLG LGSANSSQDS LHKAPKKKGI KSSIGRLFGK 840 

KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRRLKK KHELLEEARR KGLPFAQWDG 900 

PTVVAWLELW LGMPAWYVAA CRANVKSGAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960

HVSLTSPSAP PTSRTPSGNV WVTHEEMENL AAPAKTKESE EGSWAQCPVF LQTLAYGDMN 1020

HEWIGNEWLP SLGLPQYRSY FMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIMC 1080 

LKRLNYDRKE LERRREASQH EIKDVLVWSN DRIIRWIQAI GLREYANWIL ESGVHGSLIA 1140 

LDENFDYSSL TLLLQIPTQN TQARQILERE YNNLLALGTE RRLDESDDKN FRRGSTWRRQ 1200
FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKMTTDVA SSRLQRLDNS TVRTYSCLE 

SEQ ID NO:239 P«4 DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60 

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120 

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420 

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720 

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATCA AATATGATCT CAGTTCTCTT 900

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020 

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 



SEQ ID NO:240 PQ4 Protein sequence

Protein Accession #: NP_057654

1 11 21 31 41 51 

I I I I I I 

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60 
KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 
KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 
VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240
IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300
MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360 
EDGHTDNHLP LLENNTH 
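
The coding-sequence coordinates given with each DNA sequence are 1-based and inclusive. As an illustrative sketch only (it is not part of the original listing), the following Python snippet shows how such coordinates could be used to translate a coding region with the standard genetic code, so that the result can be compared with the corresponding protein sequence. The function name translate_cds and its parameters are hypothetical. The sketch assumes the spaces between the 10-base groups are not part of the sequence, and it reports any codon containing a character other than A, C, G or T as X.

```python
# Standard genetic code, codons enumerated in T, C, A, G order (NCBI table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate positions start..end (1-based, inclusive).

    Hypothetical helper: translation stops at the first stop codon, and
    codons with ambiguous or damaged characters are reported as 'X'.
    """
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-base groups
    cds = seq[start - 1:end]            # convert 1-based inclusive to a Python slice
    residues = []
    for pos in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First row (positions 1-60) of SEQ ID NO:239 as printed above.
first_row = "ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG"
print(translate_cds(first_row, 1, 60))  # MRRLNRKKTLSLVKELDAFP
```

For the first 60 bases of SEQ ID NO:239 this reproduces the first 20 residues of SEQ ID NO:240.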

SEQ ID NO:241 PBA7 DNA SEQUENCE

Nucleic Acid Accession #: AA219134

Coding sequence: 24-1815 (underlined sequences correspond to start and stop codons) 



AATTCCCCCT TGCTTAATTA AGCATGTTTA CCTTCCTGTC ATCTGTCACT GCTGCTGTCA 60 
GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCTCTTCTT CAGATCAAAA 120
CCTTATTAGC CCTGAGCTGC CATGAGCAGG AAATGGTTGT GAGCTCCCTC GTCATTGGAG 180
CCCTCCTTGC CTCACTCACC GGAGGGGTCC TGATAGACAG ATATGGAAGA AGGACAGCAA 240 
TCATCTTGTC ATCCTGCCTG CTTGGACTCG GAAGCTTAGT CTTGATCCTC AGTTTATCCT 300 
ACACGGTTCT TATAGTGGGA CGCATTGCCA TAGGGGTTTC CATCTCCCTC TCTTCCATTG 360 
CCACTTGTGT TTACATCGCA GAGATTGCTC CTCAACACAG AAGAGGCCTT CTTGTGTCAC 420
TGAATGAGCT GATGATTGTC ATCGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTG 480 
CCAATGTTTT CCATGGCTGG AAGTACATGT TTGGTCTTGT GATTCCCTTG GGAGTTTTGC 540 
AAGCAATTGC AATGTATTTT CTTCCTCCAA GCCCTCGGTT TCTGGTGATG AAAGGACAAG 600
AGGGAGCTGC TAGCAAGGTT CTTGGAAGGT TAAGAGCACT CTCAGATACA ACTGAGGAAC 660 
TCACTGTGAT CAAATCCTCC CTGAAAGATG AATATCAGTA CAGTTTTTGG GATCTGTTTC 720 
GTTCAAAAGA CAACATGCGG ACCCGAATAA TGATAGGACT AACACTAGTA TTTTTTGTAC 780
AAATCACTGG CCAACCAAAC ATATTGTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 
TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT CCACTGGGGT TGGAGTCGTC AAGGTCATTA 900 
GCACCATCCC TGCCACTCTT CTTGTAGACC ATGTCGGCAG CAAAACATTC CTCTGCATTG 960 
GCTCCTCTGT GATGGCAGCT TCGTTGGTGA CCATGGGCAT CGTAAATCTC AACATCCACA 1020 
TGAACTTCAC CCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080 
TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAGAGAC CACTTCAAAG 1140 
GGATTTCTTC CCATAGCAGA AGCTCACTCA TGCCCCTGAG AAATGATGTG GATAAGAGAG 1200 
GGGAGACGAC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TACCAGATAG 1260 
TCACAGACCC TGGGGACGTC CCAGCTTTTT TGAAATGGCT GTCCTTAGCC AGCTTGCTTG 1320 
TTTATGTTGC TGCTTTTTCA ATTGGTCTAG GACCAATGCC CTGGCTGGTG CTCAGCGAGA 1380
TCTTTCCTGG TGGGATCAGA GGACGAGCCA TGGCTTTAAC TTCTAGCATG AACTGGGGCA 1440 
TCAATCTCCT CATCTCGCTG ACATTTTTGA CTGTAACTGA TCTTATTGGC CTGCCATGGG 1500 
TGTGCTTTAT ATATACAATC ATGAGTCTAG ATCTTATTGG CCTGCCATGG GTGTGCTTTA 1560
TATATACAAT CATGAGTCTA GCATCCCTGC TTTTTGTTGT TATGTTTATA CCTGAGACAA 1620 
AGGGATGCTC TTTCGAACAA ATATCAATGG AGCTAGCAAA AGTGAACTAT GTGAAAAACA 1680 
ACATTTGTTT TATGAGTCAT CACCAAGAAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740
AACCCCAGGA GCAGCTCTTG GAGTGTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 
TTTCTCCAGA GACCTAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA 1860
GGAGGGTGTC TTTGGACCAA TGCATAGTTG CGACTCCTGT GCTCTCTTTT CAGTGTCATG 1920
GAACTGGTTT TGAAGAGACA CTCTGAAATG ATAAAGACAG CCTTTAATCC CCCTCCTCMC 1980
CAGAAGGAAC CTCAAAAGGT AGATGAGGTA CAAGGTCCTA AGTGATCTCT 1 1 1 ICT GAOC 2040 
AGGATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGTTTA ATACTTTCTA CCTTCTTCAC 2100
AGAGCAGCCT TTGAATAGAC TATGTCCTAG TGAAGACATC AACCTCCGCC TTAAGCTATG 2160 
TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220
GTACAGTTTC TGCCTACCAA GACACTACTT GCACTGGATC TTACGCAAAA AAGAACCAGA 2280
ACACACAGTG TGGACAACTG CCCATATATT CTATCTAGAT TAGGAGAGGG TCCTGGCTAG 2340
GATTTTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400 
TGAACTATAA ACTATAATTT AATGCAAAAT ATCCTTTTAT GAATTTCATG TTAATATTGT 2460
GAAATATTAA AATAATTCCR CAATAGTTGA GAAAAATGAG CATTTTTTTC CATTTTTAAA 2520 
AAATGCATAG AAAAGACAAT TTTAAAATCC TGGGACCATA TTTATTTAGA AGTAGCTGTT 2580
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTAAGCAATT 2640 
AGGTTGAAGT TATTAAGTCA AGCCTAGAAA AGCTGCCTCC TTGTAAGGCT TTCATGACAA 2700 
TGTATAGTAA TCCACAGTGT CCAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760 
TACGGTACAC AGGCTATAAT TGATGATGAT GTTCAGATAA CTGAAGACAC AATAAATGAC 2820 
ATTCAGACAT CAGGAMAAWW CCCTCATGTT CTTTTCTATG ATGGCCACCT GTACCAGCAA 2880
CGTGGGTTTC ACCCACACAA CGATGAACTG TTCTCTTACT TCTCCAGTTG ATTTTAAAGA 2940 
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATGATAG AAAAWCCACA ATCAAAWCTT 3000
GAACCAAATA ACATATTAAA TTACTAATAT TTAAGTGATG GAAGACACAC AAAAAACTTA 3060
AAAGCACGAA CAACCTAACT TGAAAAAGAA TTTTAAAATA TGATTAACCT GAAGAAAAGA 3120 
GAATCCTAAG AGCCAAAGCT CCTTTTTATT TAGCTTGGAA TTTTCCTATT GGTTCCTAAC 3180 
AAACTGTCCC AATGTCATAT AAGGAAACAT GATCTATTAC ATTCCTTTAT AACAATGTGG 3240 
AGAGACTATA AACCTATGTA AGTAGTAAAA CTATATYAGA GACTCAGGAG ACTGACTAAA 3300 
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360
GGAGATTGGA GATCTCAATT CTATCATGGT GTATTTCATA CGCAAATCAG AGCATGCATT 3420 
GTTTTTTGTT TTTGGAAAGA GAAGGGAAGT GTGTTCTGCC CCATGTTTCC TTCCGTGTTT 3480
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TTTTTTGTTT AGCCCTTCAT TATAAATGGG 3540 
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG ACCTTGACTT CAGTATCTTG 3600
AGCATTCTTT TATATTTTTC TTTTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660 
AAATTTTTAT CCAATATCCA TTGCACCACA CCAAATCAAG CTTCTTGATT TTCAAAAATA 3720
AAAAGGGGGA AATACTTACA ACTTGTACAT ATATATTCAC AGTTTTTATT TATAAAAAAA 3780 
ATTTACAGTA CTTATGGAGA GCCAGCAGAA GACATCAGAG CACTCACTTC TTCCCATCTT 3840
TGTTAAGGTT AGCGAATTAC CCATGGACAC TGTTAGGTGA GGCTCATTCG GCAGCCCTGA 3900 
AAACAAACCT GGTCACACTG TCTTTACCCT CTCCCTTCAG ATAAAGCACT TCGATTATCT 3960 
ATTGATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTGGATCTG 4020 
TACCTTGGCT ATATAAGCAT GTTTTCCCCC TATTCTATGT TTCTTTTTTT GGTGAACATT 4080
GAAAAACAGG AGGTGACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAAGTCTTT 4140 
AAAACAGTGA GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATCCTA 4200 
CTGAATCTTA AAATAAAGGA AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260 
MWAAAAATCT CAATGAAATA TTTCACAAGA AGGAAAAA



SEQ ID NO:242 PBA7 Protein sequence



Protein Accession #: AAF91431 

MFTFLSSVTA AVSGLLVGYE LGIISGALLQ IKTLLALSCH EQEMVVSSLV IGALLASLTG 60
GVLIDRYGRR TAIILSSCLL GLGSLVLILS LSYTVLIVGR IAIGVSISLS SIATCVYIAE 120
IAPQHRRGLL VSLNELMIVI GILSAYISNY AFANVFHGWK YMTGLVIPLG VLQAIAMYFL 180
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVIKSSL KDEYQYSFWD LFRSKDNMRT 240
RIMIGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGVVK VISTEPATLL 300
VDHVGSKTFL CIGSSVMAAS LVTMGIVNLN IHMNFTHICR SHNSINQSLD ESVTYGPGNL 360
STNNNTLRDH FKGISSHSRS SLMPLRNDVD KRGETTSASL LNAGLSHTEY QIVTDPGDVP 420
AFLKWLSLAS LLVYVAAFSI GLGPMPWLVL SEIFPGGIRG RAMALTSSMN WGINLLISLT 480
FLTVTDLIGL PWVCFTYTTM SLDLIGLPWV CFIYTTMSLA SLLFVVMFIP ETKGCSLEQI 540
SMELAKVNYV KNNICFMSHH QEELVPKQPQ KRKPQEQLLE CNKLCGRGQS RQLSPET 

SEQ ID NO:243 PAB4 DNA sequence: 
Nucleic Acid Accession #: AA172056

Coding sequence: 121-339 (underlined sequences correspond to start and stop codons) 

TTTAGCCACC AGAGGANTTC TCTTGAAATA CCCAAAATCC ATCAGTATCT TGAATCATGC 60 
TGGATTTTGA AGAATTCTTA AGAAGCCATG TAAAGGGGGC TCTCTGGCCT TGAAATAGTG 120
ATGTTTTTTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180
GATTTCAAGA AATTACAGGA AAACTTTCCA AAGTTCCATC TCACAGAANN TTATTTTNCC 240
AAGAATTCCA AGATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300
CATCTCAGAC ATCGACAGAT GATTACATCA CTTATAGTTC TAGTAAATTT ATTAATATAA 360
AACTCAGAGA CATTCCAATA TCCACATTGC TTACACCATT AGGCATAGAT TCAGTGTCAG 420
CTATGACAAT TGAAAATGAG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAACCAAAC 480
TGCTTGATCC AGATGCAGGA CTGCAAATGT TAATATTTGT TCTGGAAGAA CAATCAAATA 540
AGACTTAAGA GGAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTGTGC 600
AGCCTACTAA ATCAGAATGA AAATAGAAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660
TTTCTTAAGC TTACCTAAAG TTATTTCATC TGAAAATTTC AAGCAACTTT GTTCAACATT 720
AAATTGACAA TCTAAACTAA CAAGTCTTTT GAATTTATGC ATGGTAGTAA ACATTCTCTC 780
TATTAACTTT ATTACCTAAG GCTAAACCTA AAATTTTTAA GCAAAATTAG AAAAATAGTC 840 
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCGTT 900 
CTTGGTTTTT TATTTGGAGA GTCTGTGCAA AATGTCACTA AAAATAAATT AGCACTAGAA 960
ATTATTTCTA AATACCAAA 

SEQ ID NO:244 PBQ8 DNA SEQUENCE 

Nucleic Acid Accession #: X51405

Coding sequence: 3-1721 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51
I I I I I I
AAATGGCGTG CCCGTCTCTC CGCCGGCCCC CTGCCTCGCA GTGGTTTCTC CTGCAGCTCC 60
CCTGGGCTCC GCGGCCAGTA GTGCAGCCCG TGGAGCCGCG GCTTTGCCCG TCTCCTCTGG 120
GTCGCCCCAG TGCGCGGGCT GACACTCATT CAGCCGGGGA AGGTGAGGCG AGTAGAGGCT 180
GGTGCGGAAC [missing] CAGCAGCGCC GGCGGGCTAA GCCCAGGGCC GGGCAGACAA 240
AAGAGGCCGC CCGCGTAGGA AGGCACGGCC GGCGGCGGCG GAGCGCAGCG ATGGCCGGGC 300
GAGGGGGCAG CGCGCTGCTG GCTCTGTGCG GGGCACTGGC TGCCTGCGGG [missing] 360
GCGCCGAAGC CCAGGAGCCC GGGGCGCCCG CGGCGGGCAT GAGGCGGCGC CGGCGGCTGC 420
AGCAAGAGGA CGGCATCTCC TTCGAGTACC ACCGCTACCC CGAGCTGCGC GAGGCGCTCG 480
TGTCCGTGTG GCTGCAGTGC ACCGCCATCA GCAGGATTTA CACGGTGGGG CGCAGCTTCG 540
AGGGCCGGGA [missing] ATCGAGCTGT CCGACAACCC TGGCGTCCAT GAGCCTGGTG 600
AGCCTGAATT TAAATACATT GGGAATATGC ATGGGAATGA GGCTGTTGGA CGAGAACTGC 660
TCATTTTCTT GGCCCAGTAC CTATGCAACG AATACCAGAA GGGGAACGAG ACAATTGTCA 720
ACCTGATCCA CAGTACCCGC ATTCACATCA TGCCTTCCCT GAACCCAGAT GGCTTTGAGA 780
AGGCAGCGTC TCAGCCTGGT GAACTCAAGG ACTGGTTTGT GGGTCGAAGC AATGCCCAGG 840
GAATAGATCT GAACCGGAAC TTTCCAGACC TGGATAGGAT AGTGTACGTG AATGAGAAAG 900
AAGGTGGTCC AAATAATCAT CTGTTGAAAA ATATGAAGAA AATTGTGGAT CAAAACACAA 960
AGCTTGCTCC TGAGACCAAG GCTGTCATTC ATTGGATTAT GGATATTCCT TTTGTGCTTT 1020
CTGCCAATCT CCATGGAGGA GACCTTGTGG CCAATTATCC ATATGATGAG ACGCGGAGTG 1080
GTAGTGCTCA CGAATACAGC TCCTCCCCAG ATGACGCCAT TTTCCAAAGC TTGGCCCGGG 1140
CATACTCTTC TTTCAACCCG GCCATGTCTG ACCCCAATCG GCCACCATGT CGCAAGAATG 1200
ATGATGACAG CAGCTTTGTA GATGGAACCA CCAACGGTGG TGCTTGGTAC AGCGTACCTG 1260
GAGGGATGCA AGACTTCAAT TACCTTAGCA GCAACTGTTT TGAGATCACC GTGGAGCTTA 1320
GCTGTGAGAA GTTCCCACCT GAAGAGACTC TGAAGACCTA CTGGGAGGAT AACAAAAACT 1380
CCCTCATTAG CTACCTTGAG CAGATACACC GAGGAGTTAA AGGATTTGTC CGAGACCTTC 1440
AAGGTAACCC AATTGCGAAT GCCACCATCT CCGTGGAAGG AATAGACCAC GATGTTACAT 1500
CCGCAAAGGA TGGTGATTAC TGGAGATTGC TTATACCTGG AAACTATAAA CTTACAGCCT 1560
CAGCTCCAGG CTATCTGGCA ATAACAAAGA AAGTGGCAGT TCCTTACAGC CCTGCTGCTG 1620
GGGTTGATTT TGAACTGGAG TCATTTTCTG AAAGGAAAGA AGAGGAGAAG GAAGAATTGA 1680
TGGAATGGTG GAAAATGATG TCAGAAACTT TAAATTTTTA AAAAGGCTTC TAGTTAGCTG 1740
CTTTAAATCT ATCTATATAA TGTAGTATGA TGTAATGTGG TCTTTTTTTT AGATTTTGTG 1800
CAGTTAATAC TTAACATTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTTCCTTA 1860
AAATAAATAG CCTCTTAGGT AAAAATATAA GAACTTGATA TATTTCATTC TCTTATATAG 1920
TATTCATTTT CCTACCTATA TTACACAAAA AAGTATAGAA AAGATTTAAG TAATTTTGCC 1980
ATCCTAGGCT TAAATGCAAT ATTCCTGGTA TTATTTACAA TGCAGAATTT TTTGAGTAAT 2040
TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TACTGTAATT GGTGACAATG TCACATAATG 2100
AATGCTATTG AAAAGGTTAA CAGATACAGC TCGGAGTTGT GAGCACTCTA CTGCAAGACT 2160
TAAATAGTTC AGTATAAATT GTCGTTTTTT TCTTGTGCTG ACTAACTATA AGCATGATCT 2220
TGTTAATGCA TTTTTGATGG GAAGAAAAGG TACATGTTTA CAAAGAGGTT TTATGAAAAG 2280
AATAAAAATT GACTTCTTGC TTGTACATAT AGGAGCAATA CTATTATATT ATGTAGTCCG 2340
TTAACACTAC TTAAAAGTTT AGGGTTTTCT CTTGGTTGTA GAGTGGCCCA GAATTGCATT 2400
CTGAATGAAT AAAGGTTAAA AAAAAATCCC CAGTGAAAAA AAA



SEQ ID NO:245 PBQ8 Protein sequence
Protein Accession #: P16870



MAGRGGSALL ALCGALAACG WLLGAEAQEP GAPAAGMRRR RRLQQEDGIS FEYHRYPELR 60
EALVSVWLQC TAISRIYTVG RSFEGRELLV IELSDNPGVH EPGEPEFKYI GNMHGNEAVG 120
RELLIFLAQY LCNEYQKGNE TTVNLIHSTR IHIMPSLNPD GFEKAASQPG ELKDWFVGRS 180
NAQGIDLNRN FPDLDRIVYV NEKEGGPNNH LLKNMKKTVD QNTKLAPETK AVIHWIMDIP 240
FVLSANLHGG DLVANYPYDE TRSGSAHEYS SSPDDAIFQS LARAYSSFNP AMSDPNRPPC 300 
RKNDDDSSFV DGTTNGGAWY SVPGGMQDFN YLSSNCFETT VELSCEKFPP EETLKTYWED 360 
NKNSLLSYLE QIHRGVKGFV RDLQGNPIAN ATTSVEGIDH DVTSAKDGDY WRLLIPGNYK 420
LTASAPGYLA ITKKVAVPYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SETLNF 



SEQ ID NO:246 PBY4 DNA sequence

Nucleic Acid Accession #: AF038966

Coding sequence: 91-1107 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

I I I I I I

GGGGCGACGT GAGCGCGCAG GGCGGCGGCG GCCTCGCCTC GTCTCTCTCT CTGCGCCTGG 60

GTCGGGTGGG TGACGCCGAG AGCCAGAGAG ATGTCGGATT TCGACAGTAA CCCGTTTGCC 120

GACCCGGATC TCAACAATCC CTTCAAGGAT CCATCAGTTA CACAAGTGAC AAGAAATGTT 180

CCACCAGGAC TTGATGAATA TAATCCATTC TCGGATTCTA GAACACCTCC ACCAGGCGGT 240

GTGAAGATGC CTAATGTACC CAATACACAA CCAGCAATAA TGAAACCAAC AGAGGAACAT 300

CCAGCTTATA CACAGATTGC AAAGGAACAT GCATTGGCCC AAGCTGAACT TCTTAAGCGC 360

CAAGAAGAAC TAGAAAGAAA AGCCGCAGAA TTAGATCGTC GGGAACGAGA AATGCAAAAC 420

CTCAGTCAAC ATGGTAGAAA AAATATTTGG CCACCTCTTC CTAGCAATTT TCCTGTCGGA 480

CCTTGTTTCT ATCAGGAATT TTCTGTAGAC ATTCCTGTAG AATTCCAAAA GACAGTAAAG 540

CTTATGTACT ACTTGTGGAT GTTCCATGCA GTAACACTGT TTCTAAATAT CTTCGGATGC 600

TTGGCTTGGT TTTGTGTTGA TTCTGCAAGA GCGGTTGATT TTGGATTGAG TATCCTGTGG 660

TTCTTGCTTT TTACTCCTTG TTCATTTGTC TGTTGGTACA GACCACTTTA TGGAGCTTTC 720

AGGAGTGACA GTTCATTTAG ATTCTTTGTA TTCTTCTTCG TCTATATTTG TCAGTTTGCT 780

GTACATGTAC TCCAAGCTGC AGGATTTCAT AACTCGGGCA ATTGTGGTTG GATTTCATCC 840

CTTACTGGTC TCAACCAAAA TATTCCTGTT GGAATCATGA TGATAATCAT AGCAGCACTT 900

TTCACAGCAT CAGCAGTCAT CTCACTAGTT ATGTTCAAAA AAGTACATGG ACTATATCGC 960

ACAACAGGTG CTAGTTTTGA GAAGGCCCAA CAGGAGTTTG CAACAGGTGT GATGTCCAAC 1020

AAAACTGTCC AGACCGCAGC TGCAAATGCA GCTTCAACTG CAGCATCTAG TGCAGCTCAG 1080

AATGCTTTCA AGGGTAACCA GATTTAAGAA TCTTCAAACA ATACACTGTT ACCTTTTGAC 1140

TGTACCTTTT TCTCCAGTTA CTGTATTCTA CAAATATTTT TATGTTCAAA ACACACAGTA 1200 

CAGACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260 

GTCTTATTAC TTTACCTAAT AGTTTCTTAA TATTTCAGTG CCCCTTGCAG AAAAAATATT 1320 

ACATGCTAAA TAAATATTCT CCATATTTTT GGGGGATGAC ATTCAGTGAA TTATTTCAGT 1380 

GGTGACCCAC TGAAAATTAA TAATGGTACT TATGATTAAA AACGCATTTA ATACTAACTG 1440 

CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCAC ATTGGAAAAG TAAACCATGT 1500 

TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATAGGCC AGCAGAGACT TAGGGATTTT 1560 

AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCAGTGAAG AGCCTATGTG CATTTTGTAG 1620 

TAGATAATGT AAAATTTGTC ATCTTTTTCT TTTCTTTTTT TTAGAATAGC TGATATTTTG 1680 

ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740 

GCTTCTGTAC TGCTTATGGT TGTAGGATTC AGGGGTTAAT GGAATCACAG AAATGATATT 1800 

CTGCAAGAAT TTCTTTTAAA TAAAAAGTTT GGGGGTGCAA TATAAGAAGT TTATATAATA 1860 

TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAGTAGTG GTAATAATTC 1920 
CTTTTT 



SEQ ID NO: 247 PBY4 Protein sequence: 

Protein Accession #: 

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV PPGLDEYNPF SDSRTPPPGG VKMPNVPNTQ 60 
PAIMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120 
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNLFGC LAWFCVDSAR 180
AVDFGLSILW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240 
NWGNCGWISS LTGLNQNIPV GIMMIIIAAL FTASAVISLV MFKKVHGLYR TTGASFEKAQ 300
QEFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI 



SEQ ID NO:248 PBH2 DNA sequence

Nucleic Acid Accession #: none found

Coding sequence: 1-613 (underlined sequence corresponds to start and stop codon)



ATGAGAGACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTGAAGGC 60
ACAGTAATAG CAGGCTATTC AGTGTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120
AGTGCACTAC AATTTCCTAA AAAGTCTTCT CACCCTCACA GGACTGCTCT ACATCTGGCC 180 
TCTGCCAATG GAAATTCAGA AGTAGTAAAA CTCCTGCTGG ACAGACGATG TCAACTTAAT 240 
ATCCTTGACA ACAAAAAGAG GACAGCTCTG ACAAAGGCCG TACAATGCCA GGAAGATGAA 300 
TGTGCGTTAA TGTTGCTGGA ACATGGCACT GATCCGAATA TTCCAGATGA GTATGGAAAT 360 
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGCCAAAGC ACTGCTCTTA 420
TACGGTGCTG ATATCGAATC AAAAAACAAG CATGGCCTCA CACCACTGTT ACTTGGTGTA 480
CATGAGCAAA AACAGCAAGT GGTGAAATTT TTAATCAAGA AAAAAGCAAA TTTAAATGCA 540 
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600 
ATATATGAAA AGTAG 



SEQ ID NO:249 PBH2 Protein sequence:
Protein Accession #: none found

MRDNKSCAFF MGKLNVCFEG TVIAGYSVFA TTCIIHLAVA SALQFPKKSS HPHRTALHLA 60
SANGNSEVVK LLLDRRCQLN ILDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120
TALHYAIYNE DKLMAKALLL YGADIESKNK HGLTPLLLGV HEQKQQVVKF LIKKKANLNA 180
LDRYGRCVTL GTLFTTKYVV IYEK



SEQ ID NO:250 PBJ1 DNA sequence
Nucleic Acid Accession #: XM_005829

Coding sequence: 1-3043 (underlined sequence corresponds to start and stop codon) 

ATGGTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60 
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180
TTATTATTAA ACAATGGCAG CAGCGCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240 
CCCAGAGGTC TCCCCCATCG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300
ATCAGGAGCA GATTTGAAGA ATTACAAAGT GAATTGGTGC CAGTCAGCAT GTCAGAGACA 360
GACCACATAG CCTCTACTTC CTCTGATAAA AATGTTGGGA AAACACCTGA ATTAAAGGAA 420
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAAAATGA GTCCAAACTA 480
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG AGCATAATAA TCGAATTGAA 540 
GCCCAGGAAA ATTATATTCC AGATCATGGT GGAGGTGAGG ATTCTTGTGC CAAAACAGAC 600
ACAGGCTCAG AAAATTCTGA ACAAATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780 
TGCACCAAGA AATTTATTTC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840 
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900
AATAAGGGAG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAG 960 
CAGGAACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAATAAAGAA 1140
GTTAAAGCTT CCAGAGATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200
CAGTTACACA AAGAGATGGC CCAACGGATG GAACAGGCCA ACAAGAAATG TGAAGAGGCA 1260
CGCCAAGAAA AAGAAGCAAT GGTAATGAAA TATGTAAGAG GTGAGAAGGA ATCTTTAGAT 1320 
CTTCGAAAGG AAAAAGAGAC ACTTGAGAAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380 
AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG GACGGTTGCA CCAGCTGTAT 1440 
GAAACTAAGG AAGGCGAAAC GACTAGACTC ATCAGAGAAA TAGACAAATT AAAGGAAGAC 1500 
ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560 
TCACACAAGG AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620 
GAAGAAGCAG ATCAGATACG AAAAAACTGT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680 
GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTTGAA 1740 
AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800 
GAAGATCTGA AGAGAACATT TAAGGAGGGT ATGGATGAGT TAAGAACACT GAGAACAAAG 1860
GTGAAATGTC TAGAAGATGA ACGATTAAGA ACAGAAGATG AATTATCAAA ATATAAGGAA 1920
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTGAA AACTGCAGAT 1980 
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTG AAAATTTGAA AGAAGAAGTG 2040 
GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TCGAAGGCAG TAGGAAAAGA 2100 
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AGAATGCACA GCTTCAGTCT 2160 
GAATCCAATT CTTTGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220 
AGCCAGTGTG AACAAATGAA ACAGAGAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280 
GAAGAACTGC GAAAAGAGGA AGTCCAAACT CTGCAAGCTG AACTCGCTTG TAGACAAACA 2340
GAAGTTAAAG CATTGAGTAC CCAGGTAGAA GAATTAAAAG ATGAGTTAGT AACTCAGAGA 2400
CGTAAACATG CCTCTAGTAT CAAGGATCTC ACCAAACAAC TTCAGCAAGC ACGAAGAAAA 2460
TTAGATCAGG TTGAGAGTGG AAGCTATGAC AAAGAAGTCA GCAGCATGGG AAGTCGTTCT 2520 
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATCGATCTCC AGAAAATACT 2580 
GGGTCCTCAG TAGCTGTGGA TAACTTTCCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640
ATAGTTAGGC TGCAAAAAGC ACATGCCCGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700
CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760 
TTACGAGAAG AATCAGGCAC ACTTTCTTCA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820 
AGTAGACGGG GTGGCATCAT GGCATCTTTA TATACATCCC ATCCAGCTGA CAATGGATTA 2880 
ACATTGGACC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGGA GGATACGTTA 2940
CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCTT 3000 
ATTAAACACC AGCATGAACT AGAACAGAGG ACAAAGAAAA CCTAAAACAA GCCTCTTGCT 3060 
CAGTAAAGAG ACAAAAGCCA CACAGGAGTA GGTGCCACTG ACCTCTATTG TTGGAGACTT 3120 
TGTTCCACTT TTTGTTTCAG CCAGTAAAAA TATTGTTTTG CTTCATCTGT ACACAAAAAA 3180 
ATACCCTTTT ACAATATGAA TGCATTGCTG TATATACTGT AAGACTGAAA GCTTTGATGA 3240
AATTTGTTTT TGTATGGTGC AATATGACAG CCTGTCATTG AATCTAAACA ACTTAATTTG 3300
CTTGTATTCA TAAGAAGTGT TGAACATTAC AAGGGCTTTT AT



SEQ ID NO:251 PBJ1 Protein sequence:
Protein Accession #: NP_060487 



MVIIYLSPCN YYMEFYREEL PHIDYLIDIQ FATGKVTQPG EDTSYHQCAQ LEARDEGTDS 60
LLLNNGSSAT LKTRTRCYGT PRGLPHRSLL QPTPPTCKTK IRSRFEELQS ELVPVSMSET 120
DHIASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QPNEHNNRIE 180
AQENYIPDHG GGEDSCAKTD TGSENSEQIA NFPSGNFAKH ISKTNETEQK VTQILVELRS 240
STFPESANEK TYSESPYDTD CTKKFISKIK SVSASEDLLE EIESELLSTE FAEHRVPNGM 300
NKGEHALVLF EKCVQDKYLQ QEHIIKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360
KQHMNTIKQL ESRIEELNKE VKASRDQLIA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420
RQEKEAMVMK YVRGEKESLD LRKEKETLEK KLRDANKELE KNTNKIKQLS QEKGRLHQLY 480
ETKEGETTRL IREIDKLKED INSHVIKVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540
EEADQIRKNC QDMIKTYQES EEIKSNELDA KLRVTKGELE KQMQEKSDQL EMHHAKIKEL 600
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE IINRQKAEIQ NLLDKVKTAD 660
QLQEQLQRGK QEIENLKEEV ESLNSLINDL QKDIEGSRKR ESELLLFTER LTSKNAQLQS 720
ESNSLQSQFD KVSCSESQLQ SQCEQMKQTN INLESRLLKE EELRKEEVQT LQAELACRQT 780
EVKALSTQVE ELKDELVTQR RKHASSIKDL TKQLQQARRK LDQVESGSYD KEVSSMGSRS 840
SSSGSLNARS SAEDRSPENT GSSVAVDNFP QVDKAMLIER IVRLQKAHAR KNEKIEFMED 900
HIKQLVEEIR KKTKIIQSYI LREESGTLSS EASDFNKVHL SRRGGIMASL YTSHPADNGL 960
TLELSLEINR KLQAVLEDTL LKNITLKENL QTLGTEDERL IKHQHELEQR TKKT



SEQ ID NO:252 PBJ6 DNA sequence
Nucleic Acid Accession #: D83750

Coding sequence: 56-1459 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

TTGCCGTGAA GGGCTGTGCG GTTCCCGTGC GCGCCGGAGC CTGCTGTGGC CTCTTATGCA 60 

CTCCACCACC CCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTCCT 120 

AGGCTGGAAG CAAGGAGATG AAGAGGAAAA GTGGGCAGAG AAGGCAGTGG ACTCTCTAGT 180 

GAAGAAGTTA AAGAAGAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240 

GGGGCAGCCC AGCAAATGCG TCACGATTCC CCGCTCCCTG GACGGGCGGC TGCAGGTGTC 300 

CCACCGCAAG GGCCTGCCCC ATGTGATTTA CTGTCGCGTG TGGCGCTGGC CGGATCTGCA 360 

GTCCCACCAC GAGCTGAAGC CGCTGGAGTG CTGTGAGTTC CCATTTGGCT CCAAGCAGAA 420 

AGAAGTGTGC ATTAACCCTT ACCACTACCG CCGGGTGGAG ACTCCAGTAC TGCCTCCTGT 480 

GCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAG 540 

CGCCTCCCTG CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTG ACTCTTTCCA 600 

GCAGCCTCCG TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCCC AGTCCCCGTG 660 

CACGGCCAGC TACCCTCACT CCCCAGGAAG TCCTTCTGAG CCAGAGAGTC CCTATCAACA 720 

CTCAGTTGAC ACACCACCCC TGCCTTATCA TGCCACAGAA GCCTCTGAGA CCCAGAGTGG 780 
CCAACCTGTA GATGCCACAG CTGATAGACA TGTAGTGCTA TCGATACCAA ATGGAGACTT 840

TCGACCAGTT TGTTACGAGG AGCCCCAGCA CTGGTGCTCG GTCGCCTACT ATGAACTGAA 900 

CAACCGAGTT GGGGAGACAT TCCAGGCTTC CTCCCGAAGT GTGCTCATAG ATGGGTTCAC 960 

CGACCCTTCA AATAACAGGA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020 

CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGGT GTGCACTTGT ACTACGTCGG 1080 

GGGAGAGGTG TATGCCGAGT GCGTGAGTGA CAGCAGCATC TTTGTGCAGA GCCGGAACTG 1140 

CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAG ATCCCCAGCG GCTGCAGCCT 1200 

CAAGGTCTTC AACAACCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260 

TGAAGTCGTG TATGAACTGA CCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320 

GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380 

TCATGGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGGCTCTC CACATAACCC 1440 
CATTTCTTCA GTGTCTTAAC AGTCATGTCT TAAGCTGCAT TTCCATAGGA T 



SEQ ID NO:253 PBJ6 Protein sequence: 
Protein Accession #: NPJH5898 

MHSTTPLSSL FSFTSPAVKR LLGWKQGDEE EKWAEKAVDS LVKKLKKKKG AMDELERALS 60 
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPL ECCEFPFGSK 120 
QKEVCINPYH YRRVETPVLP PVLVPRHSEY NPQLSLLAKF RSASLHSEPL MPHNATYPDS 180 
FQQPPCSALP PSPSHAFSQS PCTASYPHSP GSPSEPESPY QHSVDTPPLP YHATEASETQ 240 
SGQPVDATAD RHVVLSIPNG DFRPVCYEEP QHWCSVAYYE LNNRVGETFQ ASSRSVLIDG 300
FTDPSNNRNR PCLGLLSNVN RNSTTENTRR HIGKGVHLYY VGGEVYAECV SDSSIFVQSR 360 
NCNYQHGFHP ATVCKIPSGC SLKVFNNQLF AQLLAQSVHH GFEVVYELTK MCTTRMSFVK 420
GWGAEYHRQD VTSTPCWIEI HLHGPLQWLD KVLTQMGSPH NPISSVS 



SEQ ID NO:254 PW9 DNA sequence
Nucleic Acid Accession #: AB04684

Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTCCTG CCTTTTTCTT TTGCTTAAGG 60 

GATGGACAAG GAGCTGAGAT TTATGACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120 

TGGGGACACT TGGTTGATGC AGTCTCTCTC TCTCTTTCTC GGTGTTTATA ACAAAACAAA 180 

ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240 

GCATGCTTTT TTTCTCTTGC AGGGTATGTT CTGTCTTGTG CTTTTTCTTT TAGAAGCTAC 300 

TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACTGGGGC 360 

TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTG TGACAGTAAC TGAACATTGG 420 

TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATG 480 

GGGGATATGA AGACCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540 

GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATGAAGCAG 600 

AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660 

ATCGTCAAGA ATGTTCGGAA CATTGACTCT TCCGAGGGCG GGGAGAAAGA CGGCCACAAC 720 

CCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780 

AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTGAAAGAC 840 

TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA CGACGAGAAG 900 

ATTGAGGTGG ATGACCCCCC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960 

TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCTGA AGGCACTCGG AGGGGAAAAC 1020 

TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080 

GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140 

GAGGATAAAT TGAAGGAAAG CTCTGACAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200 

CTGAGCTCCG AGAAGAATGA CACCAGCCTC CCCAGCGTTG CGCCATCAAA GACAAAGTCG 1260 

TCCTCCAAGC TCTCGTCCTG CATCGCTGCC ATCGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320 

TCAGACTCCT GCAAAGAACC AGTGGCCAAT TCGAGGGAAT CCTCCCCGTT ACCAAAAGAA 1380 

GTAAATGACA GTCCGAGAGC CGCTGACAAG TCTCCTGAAT CCCAGAATCT CATCGACGGG 1440 

ACCAAAAAAC CATCCCTGAA GCAACCGGAT AGTCCCAGAA GCATCTCAAG TGAGAACAGC 1500 

AGCAAAGGAT CCCCGTCCTC TCCCGCAGGG TCCACACCAG CAATCCCCAA AGTCCGCATA 1560 

AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620 

GTGGATCTTG ACTCTGGAAA GAAACCTTCC GAGCAGACAG CGTCCGTGAT GGCCTCTGTG 1680 

ACATCCCTTC TGTCGTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCCCC CAGGGCGCCT 1740 

CTCCAGTCTG CGGTCGTGAC CAATGCAGTT TCCCCTGCAG AGCTCACCCC CAAACAGGTC 1800 

ACAATCAAGC CTGTGGCTAC TGCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860 

CAAGTCATTA ATTTGAAGCT CGCTAACAAC ACCACGGTGA AAGCCACGGT CATATCTGCT 1920 

GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980 

ACTGTCGTGG TGCCGGCATC CAGCCTGGCC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040 

CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT CCGCCAAGTG 2100 

CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160 

CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGCT 2220 

TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCCCTGTTT ACATCCCAAA CCTCAGTCCT 2280 

CCCGCCAATG CAGGGATCAC GTTACCGACG CGTGGGTACA AGTGCTTGGA GTGTGGGGAC 2340 

TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGCGT GCGCATCGAA 2400 

GTAACGTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATCCAG CCTCCTTTCC 2460 

CATGCCCGTG GGCATAAGGA GAAAGGGGTG GTAATGCAAT GCTCCCACTT AATTTTAAAG 2520 

CCAGTCCCAG CAGATCAAAT GATAGTTTCT CCGTCAAGCA ATACTTCCAC TTCAACTTCC 2580 

ACTCTTCAGA GCCCTGTGGG AGCTGGCACA CACACTGTCA CAAAAATTCA GTCTGGCATA 2640 

ACTGGGACAG TCATATCGGC TCCTTCAAGC ACTCCCATCA CCCCAGCCAT GCCCCTAGAT 2700 

GAAGACCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760 

CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820 
ACTTGCACTA TCTGCCAGAT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAGAGA 2880

ATCCATCAGC ACAAATCTCC CTACACCTGC CCTGAGTGTG GGGCCATCTG CAGGTCGGTG 2940 

CACTTCCAGA CCCACGTCAC CAAGAACTGT CTGCACTACA CGAGGAGAGT TGGTTTTCGA 3000 

TGTGTGCATT GCAATGTTGT GTACTCTGAT GTGGCTGCTC TGAAGTCTCA CATTCAAGGT 3060 

TCTCACTGTG AAGTCTTCTA CAAGTGTCCT ATTTGTCCAA TGGCGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AGATAGGAGA ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CTGTTCTCTT 3300 

TTATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTGAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TGAATGGGAA AGAGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTCCT 3540 

GGGTGGACGT GTTGGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGGAAGG AGCACGGGAA GCAAATGAAG AAACACCCCT GCCGCCAGTG TGACAAGTCT 3660 

TTCAGCTCGT CCCACAGCCT GTGCCGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTGTACGCCT GCTCGCACTG CCCAGACTCC AGACGTACCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACCCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTGA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC CCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CTGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AACCAACAGG AGAACAAACC CAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

CGGACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAAATAGCCA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAATA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTGCAG TATAATAGAG TTAACAGTAC TGTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AATGTACCTT CCTTCACCTC GTCGTATATA TCCTCGATAA GTATTAAAAC AGTATTTGAG 4560 

TTTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGTGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

TGTTTCTTTA AAACAGAGTT CTTAGTAACA GGGGCAGTTC CTGAATTCAA ATAAACCATT 4740 

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTTAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GCCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGTTTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTATTCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGGCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

ATAGATTTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GATGCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTGTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGCCATT TCAAGAAGTA 5520 

TATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACTCTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTGTACCTT TTTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



SEQ ID NO:255 PBJ8 Protein sequence 
Protein Accession #: BAB13455 

MKTPDFDDLL AAFDIPDMVD PKAAIESGHD DHESHMKQNA HGEDDSHAPS SSDVGVSVIV 60 
KNVRNIDSSE GGEKDGHNPT GNGLHNGFLT ASSLDSYSKD GAKSLKGDVP ASEVTLKDST 120 
FSQESP1SSA EEFDDDEKIE VDDPPDKEDM RSSFRSNVLT GSAPQQDYDK LKALGGENSS 180 
KTGLSTSGNV EKNKAVKRET EASSINLSVY EPFKVRKAED KLKESSDKVL ENRVLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSC1AA1A ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
DSPRAADKSP ESQNUDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKT 360 
IKTSSGEIKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSPPRAPLQ 420 
SAVVTNAVSP AELTPKQVTI KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVISAAS 480 
VQSASSAIIK AANAIQQQTV VVPASSLANA KLVPKTVHLA NLNLLPQGAQ ATSELRQVLT 540 
KPQQQIKQAI INAAASQPPK KVSRVQVVSS LQSSVVEAFN KVLSSVNPVP VYIPNLSPPA 600 
NAGITLPTRG VKCLECGDSF ALEKSLTQHY DRRSVRIEVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGVVM QCSHLILKPV PADQMIVSPS SNTSTSTSTL QSPVGAGTHT VTKIQSGITG 720 
TVISAPSSTP ITPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHPQQ AADTSGQKTC 780 
TICQMLLPNQ CSVASHQRIH QHKSPYTCPE CGAICRSVHF QTHVTKNCLH YTRRVGFRCV 840 
HCNVVYSDVA ALKSHIQGSH CEVFYKCPIC PMAPKSAPST HSHAYTQHPG IKIGEPKIIY 900 
KCSMCDTVFT LQTLLYRHFD QHIENQKVSV FKCPDCSLLY AQKQLMMDHI KSMHGTLKSI 960 
EGPPNLGINL PLSIKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCWECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR DCHKGIRKVY 1080 
ACSHCPDSRR TFTKRLMLEK HVQLMHGIKD PDLKEMTDAT NEEETEIKED TKVPSPKRKL 1140 
EEPVLEFRPP RGATTQPLKK LKINVFKVHK CAVCGFTTEN LLQFHEHIPQ HKSDGSSYQC 1200 
RECGLCYTSH VSLSRHLFTV HKLKEPQPVS KQNGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAFIKSKR MSSAEK 
SEQ ID NO:256 PBM1 DNA sequence 
Nucleic Acid Accession #: AF111647 

Coding sequence: 58-1603 (underlined sequence corresponds to start and stop codon) 






TTTTCGTCGA 
GGGGACCCCA 
AACAAGGTGT 
GTGTTCCTTT 
ATTCGATCTA 
GGAGGAAACG 
AATGCCAAGT 
TCTCAAGCAA 
TTGTCCCCTC 
GACACAGCGT 
ACCACTTTGG 
GTACCAACAA 
AAAAAAGGCC 
TTTAATGAAA 
GCCAAGGTGG 
CTTGAAATTC 
TCAGACAGAC 
TCAGATATGC 
AATGATGACA 
GAGTTAAGGA 
GAGACCAGCA 
GCTCGCCGCA 
GGCAATGTCA 
GAGACCAGGG 
TTCGAGGAGC 
GCCCCCGACA 
TTTGCTAATG 
GATGTGTATT 
AGTGAAGTCC 
GATTTTTAAT 
CTGTTTCACG 
ATTTTCTTGG 
CTGAACCAGG 
GGGCTGCCAG 
TTCCTTCAAA 
CTACATAGTA 
TTATATGACC 
CTGCCCTGCC 
AAATGGAAGA 
ACGGTATACA 
TGTATGGGAG 
CTTTTCAGAA 
AGACAGCATT 
AAATGTGGTT 
CTAAATAGAA 
AAAAAAAAAA 
AAAAAAAAAA 



11 
I 

CTCTTACCGG 
GCAAGCAGGA 
GTTTTGATTG 
GCATTGATTG 
CAGAGTTGGA 
CTAGTGCATC 
ACAACAGTCG 
CACGGAAGCA 
CACCAAAGGA 
GGGCATCAGC 
AAAATAATGA 
AGGCTACTTT 
TTGGGGCCAA 
TTGAAAAACA 
TATCTAAAGA 
AAATGAAGAA 
TCGGCATGGG 
AGACCATAGA 
GTGACGATTC 
GCAGTTCTTT 
AAGATACTGA 
AGCCAGATTA 
AGGCCATTTC 
CCCGCCTAGA 
CGAGGAAGCA 
TGGCGCAGTT 
GAGTCGTGAC 
TCCTGGAGAA 
AGATAGTTTT 
ATTTCTTTTG 
TTCCTTCCTG 
AACCTTTGAT 
AGGCTTCATG 
AATCAGCGGA 
AGACCAAAAG 
AGGTGACTGC 
TATAAATTTA 
AAGGGAATTA 
TGAGAACTCC 
GAGTTAAAGT 
AACAGTCATT 
ATTACCGAAT 
AGAATATATT 
TTGAATGAAT 
TAAACACTTG 
AAAAAAAAAA 
AAAAAAAAAA 



21 
I 

TTGGCTGGGC 
CATCTTGACC 
TGGTGCCAAA 
CTCAGGGTCC 
TTCCAACTGG 
TTCCTTTTTT 
TGCTGCTCAG 
TGGCACTGAT 
GGAAGATTTT 
AATAGCAGAA 
AGGTGGACAA 
AGAGGTATCC 
AAAAGGAAGT 
AGCTCAAGCT 
AGAATCAATT 
AGACCAAAAG 
ATTTGGAAAT 
GCAGGAATCA 
ATATTTTACT 
CTCTAGCTGG 
AACAGTTCTG 
TGAGCCAGTT 
ATCAGATATG 
GAGGCTGTCG 
GCCAGCAGGG 
CAAGCAGGGA 
TTCAATTCAG 
ATTCCTCTTT 
GCAGATTGTT 
AGAAATTCTG 
TCACACCCTC 
TTCAACACTG 
TGGGGGAGGA 
TGCTGGATGG 
TGACTGGTGT 
CAAATAATAT 
AAAATGTTTT 
ATGTTATCTT 
CTAAGAGTTC 
GGAATGAGGT 
GTAATTGGGT 
GTGTATAAAC 
GTTCAGCACA 
ATTTTGTGAA 
CAGCAGAAAA 
AAAAAAAAAA 



AGGGCCTGGA 
GGAGAGGTCT 
GCCTGCAGAA 
CTCGTGTGAC 
TTGAAGTCAT 
TCAGTGAGTG 
GTGAAAGGTG 
TCATAATAAA 
AAGAAGATAC 
AGTTTTGTTA 
AAATAAAGAA 
GTAAAATATA 
TCTTTCTTAA 
AAAAAAAAAA 
AAAAAAAAAA 



41 
I 

GCGGCTCACA 
GCCTCCGCTC 
GGGCAAGCAT 
TTGGTGTTCA 
AGTTGCGATG 
GGTGTTCCAC 
AGAAAATCAA 
ATAGTTGTGT 
ACGTTTCTCC 
TAACATCAAG 
CAAGTGTGGA 
AAAAGAAACC 
AGAAACTGGC 
TGAAGGAGCA 
TACGATTAGC 
GTGGCAAAAA 
TTATTTCACA 
CAAAACCAAG 
GTTACTTTGA 
CAGATTCCTA 
GCTATTCAGA 
ATGAGGCCCA 
GACAATCCCA 
CCATAAGCTC 
TGTCCAGTGT 
TTGCTGGAAA 
GTTCTTAATA 
GTAACCACAT 
TTCATATGGT 
TAGGAGCTTT 
TCTGTGTATA 
GACCTCGGCT 
CCATGTGACA 
ACAACACTCA 
AGATTGCTTC 
CTGTCTCTTT 
CTTTTAACAA 
TTGCTGTTTG 
TCATCTCATC 
AGCTACAGAA 
ATAAATATTT 
AAATAATTTA 
TTTGAAATTT 
AAGCTCAAAT 
AAAAAAAAAA 
AAAAAAAAAA 



51 
I 

GCTGACGATG 
GGTGCCCACT 
AACCTATGGA 
CTTGAGTTTT 
CATGCAAGTC 
CAATGACACC 
ATCGCTCGCC 
GGTTCCACCT 
TGAGGTGAGT 
GCCTGTGGAA 
AGGTCTTAAT 
AAATCAAGCT 
AAACACATGC 
GGAAGACCTG 
CTATAAGGAT 
AAATGTTGAC 
TTCAGTGACT 
AAAAAAGTAT 
CGAGCCAGTG 
TTGGAAAAAA 
CAGACCTACT 
GAAGAAGTTT 
GGCTGATTAT 
GGCTGATCTG 
GCTGCCCAAC 
ACTCTCCGTC 
CTGAAGTCAT 
CTCAGGCGGC 
ATATGTTTCT 
CCTGTGATTT 
TCCTTGCTTT 
CCTCCTGCTC 
CATGGGCTCA 
CCACACACAC 
ATTTATGTTT 
GTAAATTATT 
ACTTAAGCTT 
AATTGATGAG 
ACAAATCAAT 
AATAGTTGCG 
TTAAATCTTG 
GCTGTGTTTT 
GATAAGCCAA 
TTGTAGACTT 
AAAAAAAAAA 
AAAAAAAAAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 



SEQ ID NO:257 PBM1 Protein sequence: 
PBM1 Protein sequence CAB76901 

MGDPSKQDEL TTFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLCIDCSG SHRSLGVHLS 60 
FIRSTELDSN WSWFQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYREKIKSL 120 
ASQATRKHGT DLWLDSCWP PLSPPPKEED FFASHVSPEV SDTAWASAIA EPSSLTSRPV 180 
ETTLENNBGG QEQGPSVEGL NVFTKATLEV SSI1KKXPNQ AKKGLGAKKG SLGAQKLANT 240 
CFNEEKQAQ AADKMKEQED LAKWSKEES IVSSLRLAYK DLEIQMKKDE KMNISGKKNV 300 
DSDRLGMGFG NCRSVISHSV TSDMQT1EQE SPEMAKPRKX YNDDSDDSYF TSSSSYFDEP 360 
VELRSSSFSS WDDSSDSYWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420 
FGNVKAISSD MYFGRQSQAD YETRARLERL SASSSISSAD LFEEPRKQPA GNYSLSSVLP 480 
NAPDMAQFKQ GVRSVAGKLS VFANGWTS! QDRYGS 



SEQ ID NO:258 PBM4 DNA sequence 
Nucleic Acid Accession #: 030891 

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon) 

ATGGATACTG TCATGAAGCA GAGACATGCT GACACACCTG TTGATCATTG TCTATCTGGC 60 
ATAAGAAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120 
CTTGAAATGC AGAATCCAAA TTTGAACAAT AAAGAATGTT GTTTCACCTT TACGTTGAAT 180 
GGAAACTCCA GAAAATTAGA CCGTAGTGTG TTTACAGCAT ATGGTAAACC CAGCGAGACT 240 
ATCTACTCAG CCCTGAGTGC TAATGACTAT TTCAGTGAAA GGATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AGAAAAGACA ATAGATGGAC ATATAAATTT AGGAATGCCT 360 
CTCAAGTGCC TGCCTAGTGA TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAGAGTAGC 420 
AAAGAACATG GACACATATT ACGCCAATGT GAAAATCCAA ACATGGAATG CATTCTTTTT 480 
CATGTTCTTG CTATAGGAAG GACAAGAAAG AAGATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTTTGTAT TTATGCCTTG AAGGGTGAGA CTATTGAAGG AGCGTTATGC 600 
AAGGATGGCC GTTTTCGGTC TGACATAGGT GAATTTGAAT GGAAACTAAA GGAAGGTCAT 660 
AAGAAAATTT ATGGAAAACA GTCCATGGTG GATGAAGTAT CTGGAAAAGT CTTAGAAATG 720 
GACATTTCAA AAAAAAAAGC ATTACAACAG AAAGATATCC ATAAAAAAAT TAAACAGAAT 780 
GAAAGTGCCA CTGATGAAAT TAATCACCAG AGTCTGATAC AGTCTAAGAA AAAAGTCCAC 840 
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATTCTCCCA 900 
CCTCAGGATC TAAGCCATTA TATTAAAGAT AAAACTCGCC AGACAATTCC CAGGATTAGA 960 
AATTATTACT TTTGTAGTTT GCCCCGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020 
AGCCCGCATC TGGGTAGGCG GTATGCTATT AATCTGGATG TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAGA ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATCC GAATTTTAAA 1140 
GAGGAGGCAC AGTGGGTAAG AAAATATTTT CGGGAAGAAC AAAAGAGAAT GAATCTTTCA 1200 
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260 
TCAGTTGCAA CCTGCGAACA GCTTACATAT TATAGCAAGT CAGTTGGGTT CATGCAATGG 1320 
GACAATAATG GAAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380 
ACCTGTCGAC ATGTTGTACA TCTTATGGTG GGTAAAAACA CACATCCAAG TTTGTGGCCA 1440 
GATATAATTA GCAAATGTGC GAAGGTAACC TTCACTTATA CAGAGTTCTG CCCTACTCCT 1500 
GACAATTGGT TTTCCATTGA GCCATGGCTT AAAGTGTCCA ATGAAAATCT AGATTATGCC 1560 
ATTTTAAAAC TAAAAGAAAA TGGAAATGCG TTTCCTCCAG GACTATGGCG ACAGATTTCT 1620 
CCTCAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAGATGGTT GTACTGTGAT TCCTCTAAAC GAACGATTGA AAAAATATCC AAACGATTGT 1740 
CAAGATGGGT TGGTAGATCT CTATGATACC ACCAGTAATG TATACTGTAT GTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGGAACACA CACACGCTTA GTTATGATAC TTGTTTCTCT 1860 
GATGGGTCCT CAGGCTCCCC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTGCATACC 1920 
TTTGGGCTTT TTTATCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGGATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040 
GATGAGAAAC TTGAGACCTA CGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGCGA 2100 
CTAGGATGCT TTCGCTTTCG CTCTCGCTTT CCAATACTCG GGACTGGGGA AACCGGGAGA 2160 
ATAGAAGCAG GCAAGGACCG CCGTGGGCAC GGGGTCAGTG AGACAGGGTC CTGCTCGCGG 2220 
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGCCAA TCGGCTTCCG AAGTAGCTGG 2280 
AGCTCTGGAG CCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTGAGCG CTGGATTCCA 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCT 2400 
CTAATGAGGA TGGAGTCTAG AGGAGACCCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460 
TTCCATTCAC CTAAGAAAAA TCCAGAAGAC CAGACCATGC CCCAAAATAG GACAATATAT 2520 
GTTACCTTGA AGGCTGTCAG AAAAGAGATA GAAACTCACC AAGGCCAAGA AATGCTTGTG 2580 
CGTGGCACAG AAGGAATCAA AGAGTACATA AACCTTGGAA TGCCCCTCAG TTGTTTCCCT 2640 
GAAGGTGGCC AGGTGGTCAT TACATTTTCC CAAAGTAAAA GTAAGCAGAA GGAAGATAAC 2700 
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTG TCAAATTTTA CATTCATGCA 2760 
ATTGGAATTG GGAAGTGTAA AAGAAGGATT GTTAAATGTG GGAAGCTTCA CAAAAAGGGG 2820 
CGCAAACTCT GTGTTTATGC TTTCAAAGGA GAAACCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTCCTTTCT GGAGAATGAT GATTGGAAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCACCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAGAAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTGAGTCAGA GAAAAGAAAC 3060 
ACCTGTGTGT TGAGAGAACA AATCGTGGCT CAGTACCCCA GTTTGAAAAG AGAAAGTGAA 3120 
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGCGAAAC ATTATTTGAA 3180 
TTGCATAGAA CAACGTTTGG GAAAGTAACA AAAAATTCTT CTTCGATTAA AGTAGTGAAA 3240 
CTTCTTGTAC GTCTCAGTGA CTCAGTTGGG TACTTATTCT GGGACAGTGC AACTACGGGT 3300 
TACGCCACCT GCTTTGTTTT TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAGAT 3360 
AGCATTGTGG GAGACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAATGTGTA 3420 
AGGGTGACAT TTGGTTATGA AGAGCTAAAA GACAAGGAAA CAAACTACTT TTTTGTTGAA 3480 
CCTTGGTTTG AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGGAAAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 
ATACATATTA TTGGCCATCC ATATGGAGAA AAAAAGCAGA TTGATGCTTG TGCTGTGATC 3660 
CCTCAGGGTC AGCGAGCAAA GAAATGTCAG GAACGTGTTC AGTCTAAAAA AGCAGAAAGT 3720 
CCAGAGTATG TCCATATGTA TACTCAAAGA AGTTTCCAGA AAATAGTTCA CAACCCTGAT 3780 
GTGATTACCT ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840 
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATGAGACT 3900 
CGTAGTATCA TTGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960 
CATAAACCAT GGTATGAAGA AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020 
GAGGACTTGUSAGAATTCAG TCTACTGGAT TTAAGGGAAT GGCTTATGGA GTTGTTATTT 4080 
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TCCTATCTGC CAGGCATTTT TCTAAGCACA TGAAGAAATT AGTCCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA ATTTTTTTTT TTTTTGAGAC TGAGTCTCAC 4260 
TCTGTCGCCT GGGCTGGAGT ACAGTGGTGC GATCTCAGCT CACTGCAACT TCCACCTCCC 4320 
AGGTTCAAGC GATTCTTATG CCTCAGTCTC CTGAGCAGCT GGGATTACAG GCAAACGCCA 4380 
CCACACCCAG CTAAATTTTT TTTTTTTTTT TGTATTTTTA GTAGAGACAG GGTTTCACCA 4440 
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTGATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGGAT TACAAGTTTG AGCCACTGCA CCTGGCTAAC TTGCCCTATT TTAAAGTCAA 4560 
GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620 
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGGAA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740 
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800 
AAATGCTTGA TTAGAAATGA TCTCAAAACC TTTTAGAATT TCCAAAATCT TCATATTACT 4860 
GAAACTGTCG GAATATATGG GTCCTGAAAT TCAGAAGATG ATAGTCACTC TTCCCATATT 4920 
TATAGGCTAT TAAGGCAAGG GATATCTTAA ACATCATATT ACTTTATTTA GATTTCTACT 4980 
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTGTTTTACT TCTTCATGGT ATTATGAAGA 5040 
CTATATAGAT GATTCAACCA AGCCTGCAAA TCTCCCTCTT GTGGAATTCC ACTGGACCCA 5100 
ATCTGTTTTC CATTTCCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACCCTCCCT 5160 
CTACCTCCAG CGACTATCAC AGAAGAAGCA GGCATGTAAG ATTTTAAGGA CTGGTTTCGA 5220 
GGGGTCGAGT GTAGGAAAAC AGCCTGTTGC ATTGTAAGAG TGATGTCACC TTGAAGAGCA 5280 
GCTGGCATGA TGACTGCTGT TTGACTCCTG CATACCAAGA TATTCTGCAG CAATGTCTTT 5340 
AAACAGTGCC GGTAGTACAG ATAACCCCTC ATAAAGATGC TTATCTAACC TCCCCAGTGT 5400 
TCAGGTGTTT CACAAGAAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460 
GTTATATAAA GGGTACTTTT GGGAGGGTGA GTGCCGCCAT TTAGTGGCTG CTAGAAACAT 5520 
TGCTTCTGTT TGTAAGTTCC TATTAAATGT TCTTTCTGAG AAAAAAAAAA A 
SEQ ID NO:259 PBM4 Protein sequence: 
PBM4 Protein sequence BAB67788 

MDTVMKQTHA DTPVDHCLSG IRKCSSTFKL KSEVNKHETA LEMQNPNLNN KECCFTFTLN 60 
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERIKNQFN KNIIVYEEKT IDGHINLGMP 120 
LKCLPSDSHF KTTFGQRKSS KEDGHILRQC ENPNMECILF HVLAIGRTRK KIVKINELHE 180 
KGSKLCIYAL KGETIEGALC KDGRFRSDIG EFEWKLKEGH KKIYGKQSMV DEVSGKVLEM 240 
DISKKKALQQ KDIHKKIKQN ESATDEINHQ SLIQSKKKVH KPKKDGETKD VEHSREQDLP 300 
PQDLSHYIKD KTKQTIPRER NYYFCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAIN 360 
LLKNYQTLNE AIMHQYPNFK EEAQWVRKYF REEQKRMNLS PAKQFNIYKK DPGKMTANSV 420 
SVATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVFNGGYIF TCRHVVHLMV GKNTHPSLWP 480 
DIISKCAKVT FTYTEPCPTP DNWFSIEPWL KVSNENLDYA ILKLKENGNA FPPGLWRQIS 540 
PQPSTGLIYL IGHPEGQIKK IDGCTVIPLN ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLSYDTCFS DGSSGSPVFN ASGKLVALHT FGLFYQRGFN VHALIEFGYS 660 
MDSILCDIKK TNESLYKSLN DEKLETYDEE KARPRPAYRR LGCFRFRSRF PILGTGETGR 720 
IEAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780 
GRVLARRAVS KEQQNNCSTS LMRMESRGDP RATTNTQAQR FHSPKKNPED QTMPQNRTTY 840 
VTLKAVRKEI ETHQGQEMLV RGTEGIKEYI NLGMPLSCFP EGGQVVITFS QSKSKQKEDN 900 
HIFGRQDKAS TECVKFYIHA IGIGKCKRRI VKCGKLHKKG RKLCVYAFKG ETIKDALCKD 960 
GRFLSFLEND DWKLIENNDT ILESTQPVDE LEGRYFQVEV EKRMVPSAAA SQNPESEKRN 1020 
TCVLREQIVA QYPSLKRESE KIIENFKKKM KVKNGETLFE LHRTTPGKVT KNSSSIKVVK 1080 
LLVRLSDSVG YLFWDSATTG YATCFVFKGL FILTCRHVID SIVGDGIEPS KWATIIGQCV 1140 
RVTFGYEELK DKETNYFFVE PWFEIHNEEL DYAVLKLKEN GQQVPMELYN GITPVPLSGL 1200 
IHIIGHPYGE KKQIDACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKIVHNPD 1260 
VITYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSIIEFGSTM ESILLDIKQR 1320 
HKPWYEEVFV NQQDVEMMSD EDL 



SEQ ID NO:260 PBQ1 DNA sequence 
Nucleic Acid Accession #: NM_015642 

Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

| | | | | | 

ACATTTCAAA AAAAATACAT AGACTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60 

TACGAAGAAT GAACTCTGAG AATGTTTGGA GAATGTTTCA TCATTACTAA CAGGATATTC 120 

CTCATGACAT TGCTGTCTGA TCTTTGACCA TCAGTCTGTG ACCTGCCCCT TCTCTTTACA 180 

TGCAGCCGCT CTCTGCTCCC TGCCCCAATG AACATCTGCA CTAGGCCCAA GCCTTGGAGT 240 

AATTTACCTG AAGAGTGACA CCATTGATTT TGAAACTACT GAAGAAACCC AAGACAGCTG 300 

AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360 

CGGGCCTTCC CTGCCTGAAC TTTGAAGCTG TTTTGTCTCC AGACCCAGCC CTCATCCACT 420 

CAACACATTC ACTGACAAAC TCTCACGCTC ACACCGGGTC ATCTGATTGT GACATCAGTT 480 

GCAAGGGGAT GACCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540 

TCGAGACCCT CAACGAGCAG CGCAACCGTG GCCACTTCTG TGACGTAACG GTGCGCATCC 600 

ACGGGAGCAT GCTGCGCGCA CACCGCTGCG TGCTGGCAGC CGGCAGCCCC TTCTTCCAGG 660 

ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCGTC GGTGGTGTCA GTGCAGTCAG 720 

TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 

TGCAGATCCT CACGGCCGCC AGCATCCTCC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 

GCATCGTGTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 

CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960 

ACCTGCAGAG CCACCCACAG CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 

CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG CGCAGTGGTC AGCCACCACG 1080 

AGACTGCGCT CGGCCTGCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 

TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTCCAC CACCCCCGAG ACCACGCACT 1200 

GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260 

AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAAAG GGTGCAGATC CTGGAACGCA 1320 

ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 

AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 

AGCAGCACTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 

AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560 

CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 

GCTCCGACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 

CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 

TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGG CAACACCTAC CTGCCAGCCC 1800 

TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860 

CCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 

CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 

GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTGCAA CAAGACTTTC ACCGCCAAAC 2040 

AGAACTACGT CAAGCACATG TTCGTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 

GTTGCCGCTC CTTCTCCTTA AAGGATTACC TTATCAAGCA CATGGTGACA CACACAGGAG 2160 

TGAGGGCATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC CCAGAAGAGC TCCCTCAACG 2220 

TGCACATGCG CCTCCACCGG GGAGAGAAGT CCTACGAGTG CTACATCTGC AAAAAGAAGT 2280 

TCTCTCACAA GACCCTCCTG GAGCGACACG TGGCCCTGCA CAGTGCCAGC AATGGGACCC 2340 

CCCCTGCAGG CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACGG 2400 

AGGGGACCAC TTACGTCTGC TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460 

ACGACCACAT GAGGATGCAT GTGTCTCACG GATAAGTAGT ATCTTTCTCT CTTTCTTATG 2520 

AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTATGGCAC TAGAATTTAA 2580 

GAAATGTTTT GGTTTCATTT TTACTTTCTG TTTTTGT T TT TGTTTCGTTT CATTTTGTAC 2640 

TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700 

ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760 

CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCTTTACA AAAAAAAAAA 2820 
AAAAAAAAA 



SEQ ID NO:261 PBQ1 Protein sequence: 
PBQ1 Protein sequence: NP_058457 

MTERIHSINL HNFSNSVLET LNEQRNRGHF CDVTVRIHGS MLRAHRCVLA AGSPFFQDKL 60 
LLGYSDEEIP SVVSVQSVQK LIDFMYSGVL RVSQSEALQI LTAASILQIK TVIDECTRIV 120 
SQNVGDVFPG IQDSGQDTPR GTPESGTSGQ SSDTESGYLQ SHPQHSVDRI YSALYACSMQ 180 
NGSGERSFYS GAVVSHHETA LGLPRDHHME DPSWITRIHE RSQQMERYLS TTPETTHCRK 240 
QPRPVRIQTL VGNIHIKQEM EDDYDYYGQQ RVQILERNES EECTEDTDQA EGTESEPKGE 300 
SFDSGVSSSI GTEPDSVEQQ FGPGAARDSQ AEPTQPEQAA EAPAEGGPQT NQLETGASSP 360 
ERSNEVEMDS TVITVSNSSD KSVLQQPSVN TSIGQPLPST QLYLRQTETL TSNLRMPLTL 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPLA GQQTQFVTVS QPGLSTFTAQ 480 
LPAPQPLASS AGHSTASGQG EKKPYECTLC NKTFTAKQNY VKHMFVHTGE KPHQCSICWR 540 
SFSLKDYLIK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYE CYICKKKFSH 600 
KTLLERHVAL HSASNGTPPA GTPPGARAGP PGVVACTEGT TYVCSVCPAK FDQIEQFNDH 660 
MRMHVSDG 



SEQ ID NO: 262 PBQ6 DNA sequence 
Nucleic Acid Accession #: AI654187 

Coding sequence: 1-912 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 

| | | | | | 

ATGGTGGAAG AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60 

AACTCTACCA CTTCTGCGCC GTCGTTTCGA CCATATAAAA ACGACCTATG CGAACTGCGT 120 

CGGAAAACTC CCTCACGATG TAAAACGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180 

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACACCTGA ATTAAAGGAA GACTCATGCA ACTTGTTTTC TGGCAATGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATGT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGTGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480 

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTGGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720 

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780 

GAAAAGTGTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCATAAAAAA GGCCAGACTT 840 

GGTCTCTGTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGAT AA 



SEQ ID NO:263 PBQ6 Protein sequence: 
Protein Accession #: NP_060170 

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTVVES SVSGDHSGTL RRSQSDRTEY 60 
NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREK1IK EUQTEKDYL NDLELCVREV 120 
VQPLRNKKTD RLDVDSLFSN IESVHQEAK LLSLLEEATT DVEPAMQVIG EVFLQIKOPL 180 
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 



SEQ ID NO:264 PBY7 DNA sequence 
Nucleic Acid Accession #: NM_014323 

Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

GGGCCTACTC TGCCGCCGCC GCCGCCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCACCGC 60 

CCTCCAGGCT CCGGGACCCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG CCGCCGCCGC 120 

CTTCGCCTTC GCCTTTTGTT TCCTCCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGCGGGCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

CGCGGCGGAC CCCTCCTTCT CCTCCCCGCG TGCGCGTGCC CTTCTTGGCT GCGCGCCGGC 300 

GCCGCCTGGC GGGCGGGAGG GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCCCCCGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTGGCGCGC AGTCCAGCGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTGATCCGGG CTGGGGGCGT GTACACTCGG CGCACCTGCG AGACTACAGA 540 

GCCTCGGGCC GGCACGTGTG GGGAGTGTGC ACACGTCTGC TGCGCCCCGC TTCTCGCTGC 600 

TGAGGGGAAG GGAGGGGGCG GGCAGGTGCA GCGGCCGGGC TAGTGGGAGG GGGCGGCGGC 660 

CATGGAGCGG GTGAACGACG CTTCGTGCGG CCCGTCTGGC TGCTACACAT ACCAGGTGAG 720 

CAGACACAGC ACGGAGATGC TGCACAACCT GAACCAGCAG CGCAAAAACG GCGGGCGCTT 780 

CTGCGACGTG CTCTTGCGGG TAGGCGACGA GAGCTTCCCA GCGCACCGCG CCGTGCTGGC 840 

CGCCTGCAGC GAGTACTTTG AGTCGGTGTT CAGCGCCCAG TTGGGCGACG GCGGAGCTGC 900 

GGACGGGGGT CCGGCTGATG TAGGGGGCGC GACGGCAGCA CCAGGCGGCG GGGCCGGGGG 960 

CAGCCGGGAG CTGGAGATGC ACACTATCAG CTCCAAGGTA TTTGGGGACA TTCTGGACTT 1020 

CGCCTACACT TCCCGCATCG TGGTGCGCTT GGAGAGCTTT CCCGAACTCA TGACGGCCGC 1080 

CAAGTTCCTG CTGATGAGGT CGGTTATCGA GATCTGCCAG GAAGTCATCA AACAGTCCAA 1140 

CGTACAGATC CTGGTACCCC CTGCCCGCGC CGATATAATG CTCTTTCGCC CCCCTGGGAC 1200 

CTCGGACTTG GGCTTCCCTT TGGACATGAC CAACGGGGCA GCCTTGGCAG CCAACAGCAA 1260 

TGGCATCGCC GGCAGCATGC AGCCAGAGGA GGAGGCAGCT CGGGCGGCTG GTGCAGCCAT 1320 

TGCAGGCCAA GCCTCTTTGC CTGTGTTACC TGGGGTGGAC CGCTTGCCCA TGGTGGCTGG 1380 

ACCCCTATCC CCCCAACTGC TGACTTCCCC ATTCCCCAGT GTGGCATCCA GTGCCCCTCC 1440 

CCTGACTGGC AAGCGAGGCC GGGGCCGCCC AAGGAAGGCC AACCTGCTGG ACTCAATGTT 1500 

TGGGTCCCCA GGGGGCCTGA GGGAGGCAGG CATCCTTCCA TGCGGTCTAT GTGGTAAGGT 1560 

GTTCACTGAT GCCAACCGGC TCCGGCAGCA CGAGGCCCAG CACGGTGTCA CCAGCCTCCA 1620 

GCTGGGCTAC ATCGACCTTC CTCCTCCGAG GCTGGGTGAG AATGGGCTAC CCATCTCTGA 1680 

AGACCCCGAC GGCCCCCGAA AGAGGAGCCG GACCAGGAAG CAGGTGGCTT GTGAGATCTG 1740 

CGGCAAGATC TTCCGTGATG TGTATCATCT TAACCGGCAC AAGCTGTCCC ACTCTGGGGA 1800 

GAAGCCCTAC TCCTGCCCTG TGTGTGGGTT GCGGTTCAAG AGAAAAGACC GCATGTCCTA 1860 

CCATGTGCGG TCCCATGATG GGTCCGTGGG CAAGCCTTAC ATCTGCCAGA GCTGTGGGAA 1920 

AGGCTTCTCC AGGCCTGATC ACTTGAACGG ACATATCAAG CAGGTGCACA CTTCTGAGCG 1980 

GCCTCACAAG TGTCAGACCT GCAATGCTTC TTTTGCCACC CGAGACCGTC TGCGCTCCCA 2040 

CCTGGCCTGT CATGAAGACA AGGTGCCCTG CCAGGTGTGT GGGAAGTACT TGCGGGCAGC 2100 

ATACATGGCA GACCACCTGA AGAAGCACAG CGAGGGGCCC AGCAACTTCT GCAGTATCTG 2160 

TAACCGAGGT TTCTCCTCTG CCTCCTACTT AAAGGTCCAT GTTAAAACCC ACCACGGTGT 2220 

TCCCCTTCCC CAGGTCTCCA GGCACCAGGA GCCCATCCTG AATGGGGGAG CAGCGTTCCA 2280 

CTGCGCCAGG ACCTATGGCA ACAAAGAAGG CCAGAAATGC TCACATCAGG ATCCGATTGA 2340 

GAGCTCTGAC TCCTATGGTG ACCTCTCAGA TGCCAGCGAC CTGAAGACGC CAGAGAAGCA 2400 

GAGTGCCAAT GGCTCTTTCT CCTGCGACAT GGCAGTCCCC AAAAACAAAA TGGAGTCTGA 2460 

TGGGGAGAAG AAGTACCCAT GCCCTGAATG TGGGAGCTTC TTCCGCTCTA AGTCCTACTT 2520 

GAACAAACAC ATCCAGAAGG TGCATCTCCG GGCTCTCGGG GGCCCCCTGG GGGACCTGGG 2580 

CCCTGCCCTT GGCTCACCTT TCTCTCCTCA GCAGAACATG TCTCTCCTCG AGTCCTTTGG 2640 

GTTTCAGATT GTTCAGTCGG CATTTGCGTC ATCTTTAGTA GATCCTGAGG TTGACCAGCA 2700 

GCCCATGGGG CCTGAAGGGA AATGAGGCAG CTGCTGTGTC CCCACGGAAA CAACCATCTG 2760 

GGGACTGCTG GGAAATGCTG TGAATGCGGA GGGAAGTGAT GTTTGGGTTC TGTAGCTGAG 2820 

AGATTTTTAT TCATTTTTAA CTGCCCCCCA ACCCCACTCC AACTCCTTCT CCACCACCCA 2880 

TTCTCCCAAT GGTCTTTAGA AATAGATTTT CATCTGATAT TCTGCAGAAA TATCAATGAG 2940 

ACTTGGTATG GGACAGGGGC AGAAAACACT ACATAGGCCT CCAAGGCAAA ACCAGTCCCA 3000 

GTTTCTTTAA TGGGAAGAAG CTGGAATTCC TGGTGCTCAA TTCTTAGTGA CCCCAATCCT 3060 

ATACCCAAAT CTATGATATT CTGGGACCTC AGTGATTTTG GTCCCCTCCC ACTTCTCTAG 3120 

TTCGTCATCC TCCCTTCCCA TATCCTTCAA AAGAACCACA CTAGGGTCTC CACCTACTTA 3180 

TACAATGCGG ATGCCCAACT GTTTTTAAGG AAGCCAGAAG CATCCCATGG ACCATGGGGT 3240 

GAGTGTCCTC CAAGAGCCCC CTGAGCTCAG CCCTCTGCCT GGAGGGCTCC AGACCTTTCT 3300 

GAGCCCTGCT TGGAGGCGAG CATTTTCACT GCTAGGACAA GCTCAGCTGT TGAGGACACC 3360 

CCCACCCCAA ATTTCAGTTC TTACGTGATT TTAACCATTC AACATGCTGT TGGGTTTTAA 3420 

TTCTCTAATT ATTATTATTA TTGTTATTAT TTTTTAGGAC CAGTTGTAGT GAATTGCTAC 3480 

TGAAAGCTAT CCCAGGTGAT ACAGAGCTCT TTGTAAACCG CAGTCACACA TTAGGGTTAG 3540 

TATTAAACTT TGTTTAGATG TACCATAATT AACTTGGCTA GTTGATTGTT TGAAGTCTAT 3600 

GGAAGAAATA GTTTTATGCA AAATTTTAAA AAATGCCAGT CTGGTCAGGG AAGTAGGGGG 3660 

TTTCAATGCT GTTGGGAACC AGGAAGGTGG GACAGCCGGC AGGTAGGGAC ATTGTGTACC 3720 

TCAGTTGTGT CACATGTGAG CAAGCCCAGG TTGACCTTGT GATGTGAATT GATCTGATCA 3780 
GACTGTATTA AAAATGTTAG TACATTACTC TA 



SEQ ID NO:265 PBY7 Protein sequence: 
Protein Accession #: NP_114439 

MERVNDASCG PSGCVTYQVS RHSTEMLHNL NQQRKNGCRF CDVLLRVGDE SFPAHRAVLA 60 
ACSEYFESVF SAQLGDGGAA DGGPADVGGA TAAPGGGAGG SRELEMHTIS SKVFGDILDF 120 
AYTSRIVVRL ESFPELMTAA KFLLMRSVIE ICQEVIKQSN VQILVPPARA DIMLFRPPGT 180 
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAAI AGQASLPVLP GVDRLPMVAG 240 
PLSPQLLTSP FPSVASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG ILPCGLCGKV 300 
FTDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPISE DPDGPRKRSR TRKQVACEIC 360 
GKIFRDVYHL NRHKLSHSGE KPYSCPVCGL RFKRKDRMSY HVRSHDGSVG KPYICQSCGK 420 
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480 
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPIESSDSYG DLSDASDLKT PEKQSANGSF 540 
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGPL GDLGPALGSP 600 
FSPQQNMSLL ESFGFQIVQS AFASSLVDPE VDQQPMGPEG K 



SEQ ID NO:266 PBY9 DNA sequence 
Nucleic Acid Accession #: NM_012429 

Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 60 
GGGCAAAAGG CTGGGACTTT ACTCCGGGTG GCGGCGAGGA CGAGTCTGTG CTCCATCAGC 120 



TGCCGCACCC 
GCAGAGTCGG 
TCCAGGATGT 
GAGCCAGAAG 
GAAAGCAAAA 
ATCTGTCAGG 
TTGGACCTCT 
CCAAGATGCG 
GGAGGAAGGT 
TCTGGAAGCC 
CCGAAACACT 
ACCTCATCAA 
ATTGGAAGGA 
GCACCATGAC 
ACATCCCCAG 
AGATTTCCCG 
TCAGGTGGCA 
AGATGGGAGA 
ACTCCCACCT 
TGCGGTTTGA 
TCCTGCTTCC 
AATAACACCT 
CCTTGTAGCA 
CCTCAGGAGC 
TATCAAATAC 
CTGTAAACTG 
TGTACCACAG 
ACTTCAGGGA 
TCGCAATGAG 
TCCAAACATT 
GGCCTGAGTC 
GACTTTGGCA 
CTCAGAGCTT 
GGGAAATGAC 
GAATGCTAAA 
TCCTCCATGT 
GAGAGGGTGT 
CTGAGCAAGG 
GTTCAGGTGC 
GGCGGGGCCG 
CAGCCCTTAC 
ACATGGGAAG 
CCCGTCTGGG 
CGGACCGGAA 
TGGGTTTACA 



GCCGCCTCCC 
CGATCTGAGC 
GCTGCCGGCC 
CTTCGACCTG 
GGACATTGAC 
GGGTATGTGT 
GGATGCCAAG 
GGAGTGTGAG 
GGAGACCATC 
TGCTGTGGAG 
GAAGCGTCTT 
ACCCTTCCTG 
GGTTTTACTG 
TGACCCTGAT 
GAAGTATTAT 
TGGCTCCTCC 
GTTTATGTCA 
GAGGCAGCGG 
GGTCCCTGAA 
CAACACCTAC 
AGACAAAGCC 
TCTCCTATAG 
GTCATTTTCG 
TTTCATTTCA 
CTAAGGAGTC 
TGCCAACTTC 
GGTGGCAGCA 
AGTCAGCTGC 
GAGTAGCAGG 
TTAGCACTGA 
AGCACACATC 
ACTCCTGGGC 
CCTGGGACTT 
CCACAGGGAT 
AGCAGATCGT 
GAGCAACCCC 
TTGCCAGTCT 
TCTTACTAAG 
CGGTCGGCGT 
GCGTCTCGCA 
CCCAATCCCA 
GCGGCCCCAG 
AAGCTCATCT 
GGGGCCGAGG 
ACGCTGTTAG 



GCCCCCAAAC 
CCCAGGCAGA 
CTGCCGAATC 
CAGAAGTCGG 
AACATCATTA 
GGCTATGACC 



ACCATAATTT 
GCCTATGGAG 
TTTGTTGTTA 
AGTGAGGACA 
AAACATATCA 
GGAAACCCCA 
GTGCGAGACC 
CACCAAGTGG 
GATGGAGCGG 
GCAGGGGAGA 
GATGGGACCC 
AGCTTCATTC 
TCAGAAGAGA 
CAGGCCTGGC 
CACAACCCTG 
GTTAGGCAGA 
CCCAGGAGCT 
ACCTGTCCAG 
GGGAAAAAAA 
CGGGGAGAAA 
GTAGCTGGTT 
GGCTGGGGTA 
TTCCCACTCG 
CACACGGCCT 
CGGGTACCCA 
CGCAGCTGCA 
CCAGTGCCCT 
GAGACAAAAA 
GAGTGTCCCG 
CAGTCCCATC 
AGCCAGGCCT 
GACTAGGGGC 
CGAGCCCCGC 
ACCTGGCGGG 
TGCGAAGCTG 
CTGCACGGGC 
GAAAATTAAC 



CCCATCCCCG 
AGGAGGCATT 
CAGATGACTA 
AGGCCATGCT 
GCTGGCAGCC 
TGGATGGCTG 
TCTCAGCCTC 
AAGAGTGTGC 
ATGACTGCGA 
AGTTTCTCTG 
AAGCCCCCAA 
CTCGTAAGAA 
GCCCTGACCA 
AGTGCAAATC 
AGGTGAAACA 
AGTATGAGAT 
ATGTTGGTTT 
TGACAGAGGT 
TCACCTGCAG 
ATGCCAAGAA 
AGATGAAACA 
CCCCTCAGTG 
AAGCCCAAAG 
GGAAGAGCGA 
GGCTGGCCAT 
GGACAGCGAA 
TTAGAAAAGG 
CTTGCTCCTA 
GCTAGAGTTA 
GCTTTTGGCT 
GTAGACAGGC 
GCCTCTTTGA 



GGGAGGGCCA 
TTTCAGTGCT 
TGCTAAGTGG 
CGGTGCCCGC 
TCTGTGGGAG 
GGAGGCCCCC 
TGGGGGCGGC 
CAACGAACCA 
AACGCCTTTC 
AGGGAGCTCA 
CTCTGCCAGA 
CAATGAATAA 



CGGTTGAGCC 
GGCCAAGTTT 
TTTTCTCCTG 
CCGGAAGCAT 
TCCAGAGGTG 
CCCAGTCTGG 
CAAACAGGAC 
CCACCAGACC 
GGGGCTTGGC 
CATGTTTGAG 
ACTGTTTCCT 
GATCATGGTC 
GGTGCCTGTG 
CAAGATCAAC 
GCAGTATGAA 
CCTCTTCCCT 
TGGGATTTTC 
GCTGCCCAAC 
TGATCCTGGC 
GGTCAATTTC 
GCTGGGGGCA 
TCTCCCTGTC 
AAACTGGGCT 
CTGCAGTGGG 
CGTGATAGGA 
GCTGGGGGTG 
GTGAAAGATT 
AATGAACACA 
CGGTGGGGAT 
TTTCCCAGGT 
TGGCCTCTCC 
TTACTAATGA 
TCCATGCAAA 
GGGAGGTTGG 
ACCGGCCTCT 
GATCAAGAGA 
CAACCCGCTT 
GCATGCAACG 
CAGGCAGGAG 
CACAGACGGC 
CAGGTGCTGG 
CCTCAGAGCC 
GGGCAAAGGC 
ACGCTCAGGA 
AGCAACGTTC 



ACGATGAGCG 
CGGGAGAATG 
CGTTGGCTCC 
GTGGAGTTCC 
ATCCAACAGT 
TACGACATAA 
CTGCTGAGGA 
ACAAAGTTGG 
CTCAAGCATC 
GAAAATTATC 
GTGGCCTATA 
CTGGGAGCAA 
GAGTATGGGG 
TACGGGGGTG 
CACAGCGTGC 
GGCTGTGTCC 
CTGAAGACCA 
CAGAGGTACA 
ATCTATGTCC 
ACTGTGGAGG 
GGCACCCCGA 
AATTTCTACC 
GGAGGACAGA 
TCTCCGTGTC 
TCTGTCTGTC 
GCGGGGGGCA 
GGGACTTAAC 
TAAGTTTAGA 
CAGAAACTCT 
CTCAGGAGGT 
CTCACTTTGA 
TTGTCAGTGA 
CAAAGCGCCA 
GGGTGGGAGT 
CACCAAGCAG 
GCAGCACTCG 
CCTGACTGAC 
CGTGCAGGGA 
GCCGCCCAAA 
CTCGAAACCA 
GCTTTAGAGA 
AGGCCCCGGC 
CAGGCTAGCG 
CATCCCGGCC 
AGTGCGCA 



180 
2d0 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 



SEQ ID NO:267 PBY9 Protein sequence:
Protein Accession #: NP_036561

MSGRVGDLSP RQKEALAKFR ENVQDVLPAL PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60
EFRKQKDIDN IISWQPPEVI QQYLSGGMCG YDLDGCPVWY DIIGPLDAKG LLFSASKQDL 120
LRTKMRECEL LLQECAHQTT KLGRKVETIT IIYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180
NYPETLKRLF VVKAPKLFPV AYNLIKPFLS EDTRKKIMVL GANWKEVLLK HISPDQVPVE 240
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYEILFPG 300
CVLRWQFMSD GADVGFGIFL KTKMGERQRA GEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360
YVLRFDNTYS FIHAKKVNFT VEVLLPDKAS EEKMKQLGAG TPK



SEQ ID NO:268 PBH8 DNA sequence 
Nucleic Acid Accession #: XM_009756

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

I I I I I I

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60
CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120
TATATCCGAG ACCGCTTCTG TCCATTTAGG CTTATCCCAG GTGGAGCTCA CGGGCAACAG 180
TATTTATGAA TACATCCATC CTTCTGACCA CGATGAGATG ACCGCTGTCC TCACGGCCCA 240
CCAGCCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300
ATGAAATGTG TCTTGGCGAA AAGGAACGCG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360
CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420
TGCTACCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCACCCAG TGCCATCACC 480
GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCCA GCCTTGACCT GAAGCTGATA 540
TTCCTGGATT CCAGGGTGAC CGAGGTGACG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600
ACCCTATACC ATCACGTGCA CGGCTGCGAC GTGTTCCACC TCCGCTACGC ACACCACCTC 660
CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTACCGGC TGCTGTCCAA GCGGGGCGGC 720
TGGGTGTGGG TGCAGAGCTA CGCCACCGTG GTGCACAACA GCCGCTCGTC CCGGCCCCAC 780
TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840
CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCCTT GTCTACCTCA 900
CAAGAAACTA GGAAATTAGT GAAACCCAAA AATACCAAGA TGAAGACAAA GCTGAGAACA 960

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020 

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080 

CACTCAGAAA GCAGTGACCT TCTGTACACG CCATCCTACA GCCTGCCCTT CTCCTACCAT 1140 

TACGGACACT TCCCTCTGGA CTCTCACGTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200 

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT GAGCACACTG 1260 

CCAGCCAGCG GTGAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAGCTCGTCT 1320 

CCAGCTAAAA ATCCTCCAGA GCCACCGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380 

TACGAAGGCA AGCAGATGTC CTCTGCGGAG ATACCGCCAG CTCCCCAGGA CGCAGACTGA 1440 
CTCCTGTTTG CTCGCTGGAC CAAC 



SEQ ID NO:269 PBH8 Protein sequence:
Protein Accession #: NP_005060

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRPGEDTAP 540
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660
VITTNGR



SEQ ID NO:270 PBJ9 DNA sequence:
Nucleic Acid Accession #: AA760894

GGCACGAGGA GAAGATGTGG CTTGCTCATG CTTGACTTCT GCCATGGTTG TGAGGCCTCC 60
CCAGCCATGT GGAACTGTTT TCAGGTGCTG GTTCCATGGC TCTTCCTGAG CCGAAAATAA 120
GGAAACTCCA TAGACCTTGT CCACTGGAAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180
GGTGATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG CCCCCAAGGT TATATCCATC 240
TAGAACTTCA GCACGTAATT TCATCTGGAA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATCCAATGAA AATGTCTTCA 360
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGAGGCA 420
GAGATTGGAG GGATGCAGCC ACCGGCCCAG GAATGCCAGC AGCCACCCAG AAGCTGGAAG 480
GAAATGAGGG ATTCTCTCCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540
TTTGGACTTG CCCATAGCTT GTATACTCTT ACTTTGGATA CAATTTTATC CAAACTTGGC 600
TAAACAGTTT CTCAGCCTAT GGAAAATTTA AAATGGAGAA GATTCAACTC GATTCTTACA 660
GATTCAAAGC AAGAAAATGA TGGGAACATA GGAGGAGACC AAGAAAGCCT ATAAAAAGCA 720
AAAATATGAA GTGAACATTG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780
TACACATGAA AACCCCCAAG GGGAATCCCC ATATCACAGT GTAGTGTGAT ATTTGACATT 840
YGTGATCATY TAGAGATGTA CAGAAAAGGT GAATCTGTGT TCTGTATATT CTGCCTAAGG 900
CAAAGAAATG TTTAGCTYTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960
GAAAACTGTA AGCTTCCCAT ATCTGGAGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020
ATCTTCTTAC TTGGACATTT CATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080
TATAGCTGCT AACACTTCCC GCAGAGCTAA ACCATTACAG ANTATGAAAT AAAGACCCTA 1140
TTGATTTGAA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAAT GA

SEQ ID NO:271 PBQ4 DNA sequence
Nucleic Acid Accession #: AA149579

Coding sequence: 1-1363 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

ATGGAATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTG TTTACCTAAT 60 

GGCATAAATG GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGTGGAGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCATGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGAATTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300 

CTGTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360 

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTGTCTCA GCTTGGGCAC TTCAGTTAGG ACCTAAGGAT 480 

GCCAGCCGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCCGCC AGTTGAATTT CATTCCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAG 600 

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660 

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT CCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTACTTTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900 

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TGCCTACAGC 960 

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGGAGAAT TGAAATGTAT 1080 

ATCTCCTTTG GCATAATGAG CCTTGGCTTA CTTTCCCTCC TGGCAGTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCAGTCTAC ACTTGGATAT 1200 

GTCGCTCTGC TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACG AGCTTTTGAG 1260
GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTGCTCTTGT TTTGCCCTCA 1320
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA

SEQ ID NO:272 PBQ4 Protein sequence:

Protein Accession #: none

1 11 21 31 41 51

I I I I I I

MESISMMGSP KSLSETCLPN GINGIKDARK VTVGVIGSGD FAKSLTIRLI RCGYHVVIGS 60
RNPKFASEFF PHVVDVTHHE DALTKTNIIF VAIHREHYTS LWDLRHLLVG KILIDVSNNM 120
RINQYPESNA EYLASLFPDS LIVKGFNVVS AWALQLGPKD ASRQVYICSN NIQARQQVIE 180
LARQLNFIPI DLGSLSSARE IENLPLRLFT LWRGPVVVAI SLATFFFLYS FVRDVIHPYA 240
RNQQSDFYKI PIEIVNKTLP IVAITLLSLV YLAGLLAAAY QLYYGTKYRR FPPWLETWLQ 300
CRKQLGLLSF FFAMVHVAYS LCLPMRRSER YLFLNMAYQQ VHANIENSWN EEEVWRIEMY 360
ISFGIMSLGL LSLLAVTSIP SVSNALNWRE FSFIQSTLGY VALLISTFHV LIYGWKRAFE 420
EEYYRFYTPP NFVLALVLPS IVILDLLQLC RYPD

SEQ ID NO:273 PBQ5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001973

Coding sequence: 150-1445 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

CCGCCGCCTT CTACTCCGCC GCGGGGGTCG CAGCGGCTGC CGCGCCGTCC TCGAGTTTCC 60
AGCGTGAGGA GGAGGCTGAG GGCGGAGAGG CGCATCGTGT TCGAGGCGGA GACCGAGGGG 120
GAGCCCCGCG CGCGGCGTCG CTCATTGCTA TCGACAGTGC TATCACCCTG TGGCAGTTCC 180
TTCTTCAGCT CCTGCAGAAG CCTCAGAACA AGCACATGAT CTGTTGGACC TCTAATGATG 240
GGCAGTTTAA GCTTTTGCAG GCAGAAGAGG TGGCTCGTCT CTGGGGGATT CGCAAGAACA 300
AGCCTAACAT GAATTATGAC AAACTCAGCC GAGCCCTCAG ATACTATTAT GTAAAGAATA 360
TCATCAAAAA AGTGAATGGT CAGAAGTTTG TGTACAAGTT TGTCTCTTAT CCAGAGATTT 420
TGAACATGGA TCCAATGACA GTGGGCAGGA TTGAGGGTGA CTGTGAAAGT TTAAACTTCA 480
GTGAAGTCAG CAGCAGTTCC AAAGATGTGG AGAATGGAGG GAAAGATAAA CCACCTCAGC 540
CTGGTGCCAA GACCTCTAGC CGCAATGACT ACATACACTC TGGCTTATAT TCTTCATTTA 600
CTCTCAACTC TTTGAACTCC TCCAATGTAA AGCTTTTCAA ATTGATAAAG ACTGAGAATC 660
CAGCCGAGAA ACTGGCAGAG AAAAAATCTC CTCAGGAGCC CACACCATCT GTCATCAAAT 720
TTGTCACGAC ACCTTCCAAA AAGCCACCAG TTGAACCTGT TGCTGCCACC ATTTCAATTG 780
GCCCAAGTAT TTCTCCATCT TCAGAAGAAA CTATCCAAGC TTTGGAGACA TTGGTTTCCC 840
CAAAACTGCC TTCCCTGGAA GCCCCAACCT CTGCCTCTAA CGTAATGACT GCTTTTGCCA 900
CCACACCACC CATTTCGTCC ATACCCCCTT TGCAGGAACC TCCCAGAACA CCTTCACCAC 960
CACTGAGTTC TCACCCAGAC ATCGACACAG ACATTGATTC AGTGGCTTCT CAGCCAATGG 1020
AACTTCCAGA GAATTTGTCT CTGGAGCCTA AAGACCAGGA TTCAGTCTTG CTAGAAAAGG 1080
ACAAAGTAAA TAATTCATCA AGATCCAAGA AACCCAAAGG GTTAGGACTG GCACCCACCC 1140
TTGTGATCAC GAGCAGTGAT CCAAGCCCAC TGGGAATACT GAGCCCATCT CTCCCTACAG 1200
CTTCTCTTAC ACCAGCATTT TTTTCACAGA CACCCATCAT ACTGACTCCA AGCCCCTTGC 1260
TCTCCAGTAT CCACTTCTGG AGTACTCTCA GTCCTGTTGC TCCCCTAAGT CCAGCCAGAC 1320
TGCAAGGTGC TAACACACTT TTCCAGTTTC CTTCTGTACT GAACAGTCAT GGGCCATTCA 1380
CTCTGTCTGG GCTGGATGGA CCTTCCACCC CTGGCCCATT TTCCCCAGAC CTACAGAAGA 1440
CATAACCTAT GCACTTGTGG AATGAGAGAA CCGAGGAACG AAGAAACAGA CATTCAACAT 1500
GATTGCATTT GAAGTGAGCA ATTGATAGTT CTACAATGCT GATAATAGAC TATTGTCATT 1560
TTTGCCATTC CCCATTGAAA ACATCTTTTT AGGATTCTCT TTGAATAGGA CTCAAGTTGG 1620
ACTATATGTA TAAAAATGCC TTAATTGGAG TCTAAACTCC ACCTCCCTCT GTCTTTTCCT 1680
TTTCTTTTTC TTTCCTTCCT TCCTTTTCTT TTCTCCTTTA AAAATATTTT GAGCTTTGTG 1740
CTGAAGAAGT TTTTGGTGGG CTTTAGTGAC TGTGCTTTGC AAAAGCAATT AAGAACAAAG 1800
TTACTCCTTC TGGCTATTGG GACCCTTTGG CCAGGAAAAA TTATGCTTAG AATCTATTAT 1860
TTAAAGAAGT ATTTGTGAAA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1920
AAAAAAAAAA AAA



SEQ ID NO:274 PBQ5 Protein sequence:
Protein Accession #: NP_001964

MDSAITLWQF LLQLLQKPQN KHMICWTSND GQFKLLQAEE VARLWGIRKN KPNMNYDKLS 60
RALRYYYVKN IIKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120
ENGGKDKPPQ PGAKTSSRND YIHSGLYSSF TLNSLNSSNV KLFKLIKTEN PAEKLAEKKS 180
PQEPTPSVIK FVTTPSKKPP VEPVAATISI GPSISPSSEE TIQALETLVS PKLPSLEAPT 240
SASNVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDSVASQPM ELPENLSLEP 300
KDQDSVLLEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LGILSPSLPT ASLTPAFFSQ 360
TPIILTPSPL LSSIHFWSTL SPVAPLSPAR LQGANTLFQF PSVLNSHGPF TLSGLDGPST 420
PGPFSPDLQK T



SEQ ID NO:275 PBY3 DNA SEQUENCE

Nucleic Acid Accession #: AB040921

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

AATCAGGAAC AGATCATATA TTGACCGAGA TTCTGAGTAT CTCTTGCAAG AAAATGAACC 60
AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTCG 120
GTATATTGAA ATGCAGCATT TCAGAGAAAA GCTGCCTTCG TATGGAATGC AAAAGGAATT 180
GGTAAATTTA ATTGATAACC ATCAGGTAAC AGTAATAAGT GGTGAAACTG GTTGTGGCAA 240
AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAGGATCTGC 300
TTGCAGAATA GTTTGTACTC AGCCAAGAAG AATTAGTGCC ATTTCAGTTG CGGAAAGAGT 360
AGCTGCAGAA AGGGCAGAAT CTTGTGGCAG TGGTAATAGT ACTGGATATC AAATTCGTCT 420
CCAGAGTCGG TTGCCAAGGA AACAGGGTTC TATCTTATAC TGTACAACAG GAATCATCCT 480
TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTAGT CATATCGTAC TTGATGAAAT 540
CCATGAAAGA AATCTGCAGT CAGATGTTTT AATGACTGTT GTTAAAGACC TTCTCAATTT 600
TCGATCTGAC TTGAAAGTAA TATTGATGAG TGCAACATTG AATGCAGAAA AGTTTTCAGA 660
ATATTTTGGT AACTGTCCAA TGATACATAT ACCTGGTTTT ACCTTTCCGG TTGTGGAATA 720
TCTTTTGGAA GATGTAATTG AAAAAATAAG GTATGTTCCA GAACAAAAAG AACACAGATC 780
CCAGTTTAAG AGGGGTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAG AAGAAAAAGA 840
AGCAATATAT AAAGAACGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GGTATTCTGC 900
AAGTACTGTA GATGTTATAG AAATGATGGA GGATGATAAA GTTGATCTGA ATTTGATTGT 960
TGCCCTCATC CGATACATTG TTTTGGAAGA AGAGGATGGT GCGATACTGG TCTTTCTGCC 1020
AGGCTGGGAC AATATCAGCA CTTTACATGA TCTCTTGATG TCACAAGTAA TGTTTAAATC 1080
AGATAAATTT TTAATTATAC CTTTACATTC ACTGATGCCT ACAGTTAACC AGACACAGGT 1140
GTTTAAAAGA ACCCCTCCTG GTGTTCGGAA AATAGTAATT GCTACCAACA TTGCGGAGAC 1200
TAGCATTACC ATAGATGATG TCGTTTATGT GATAGATGGA GGAAAAATAA AAGAGACGCA 1260
TTTTGATACT CAGAACAATA TCAGTACAAT GTCCGCTGAG TGGGTTAGTA AAGCTAATGC 1320
CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAACCTGGT CATTGCTATC ATCTGTATAA 1380
TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA GAACTCCTTT 1440
GGAAGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTT ATTTTCTGAG 1500
TAGATTAATG GACCCACCAT CAAATGAGGC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560
GCTGAACGCT TTGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620
ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680
CCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740
AAAAGAAAAG ATTGCAGATG CAAGAAGAAA GGAATTGGCA AAGGATACTA GAAGTGATCA 1800
CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAGATA 1860
CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCAGA TGCTGCATAA 1920
CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980
TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040
TGCTGGTTTA TATCCCAAAG TTGCTAAAAT TCGACTAAAT TTGGGTAAAA AAAGAAAAAT 2100
GGTAAAAGTT TACACAAAAA CCGATGGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160
GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220
TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTCTTGTTTT TTGGAGGTGA 2280
CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340
TCAGTCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400
TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460
CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520
GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580
GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCCAAACC 2640
CTGGGACATG AACAATTTTC ATGTGTAAGG TAGAAGCCTT CAGTAGGTAG TAAAGACTTA 2700
ATGTGCATGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCATAAAAGC 2760
AATATGTTCT CTGATCATAT ACTCTGCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820
CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT ATACAAAATT 2880
AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCCCAAA 2940
ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000
AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC



SEQ ID NO:276 PBY3 Protein sequence:
Protein Accession #: BAA96012

IRNRSYIDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YIEMQHFREK LPSYGMQKEL 60
VNLIDNHQVT VISGETGCGK TTQVTQFILD NYIERGKGSA CRIVCTQPRR ISAISVAERV 120
AAERAESCGS GNSTGYQIRL QSRLPRKQGS ILYCTTGIIL QWLQSDPYLS SVSHIVLDEI 180
HERNLQSDVL MTVVKDLLNF RSDLKVILMS ATLNAEKFSE YFGNCPMIHI PGFTFPVVEY 240
LLEDVIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYSA 300
STVDVIEMME DDKVDLNLIV ALIRYIVLEE EDGAILVFLP GWDNISTLHD LLMSQVMFKS 360
DKFLIIPLHS LMPTVNQTQV FKRTPPGVRK IVIATNIAET SITIDDVVYV IDGGKIKETH 420
FDTQNNISTM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRASLLDDY QLPEILRTPL 480
EELCLQIKIL RLGGIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TPLGVHLARL 540
PVEPHIGKMI LFGALFCCLD PVLTIAASLS FKDPFVIPLG KEKIADARRK ELAKDTRSDH 600
LTVVNAFEGW EEARRRGFRY EKDYCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660
KDPESNINSD NEKIIKAVIC AGLYPKVAKI RLNLGKKRKM VKVYTKTDGL VAVHPKSVNV 720
EQTDFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQE TIAVDEWIVF 780
QSPARIAHLV KELRKELDIL LQEKIESPHP VDWNDTKSRD CAVLSAIIDL IKTQEKATPR 840
NFPPRFQDGY YS

SEQ ID NO:277 PBY6 DNA SEQUENCE

Nucleic Acid Accession #: AA464018

Coding sequence: 64-1669 (underlined sequence corresponds to start and stop codon)

GATTTTATCC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATGAAGATGA AATTGCAGAT 60
CTTATGGATC TGAGACAAGC TTGTCGGACG CCTAGCCGGG ATGAGGCCGG GGTGGAACTG 120
CTGATGACAT ACTTCATCCA GCTGGGCTTT GTCGAGAGTC GATTCTTCCC GCCCACACGG 180
CAGATGGGAC TCCTGTTCAC CTGGTATGAC TCTCTCACCG GGGTTCCGGT CAGCCAGCAG 240
AACCTGCTGC TGGAGAAGGC CAGTGTCCTG TTCAACACTG GGGCCCTCTA CACCCAGATT 300
GGGACCCGGT GTGATCGGCA GACGCAGGCT GGGCTGGAGA GTGCCATAGA TGCCTTTCAG 360
AGAGCCGCAG GGGTTTTAAA TTACCTGAAA GACACATTTA CCCATACTCC AAGTTACGAC 420
ATGAGCCCTG CCATGCTCAG CGTGCTCGTC AAAATGATGC TTGCACAAGC CCAAGAAAGC 480
GTGTTTGAGA AAATCAGCCT TCCTGGGATC CGGAATGAAT TCTTCATGCT GGTGAAGGTG 540
GCTCAGGAGG CTGCTAAGGT GGGAGAGGTC TACCAACAGC TACACGCAGC CATGAGCCAG 600
GCGCCGGTGA AAGAGAACAT CCCCTACTCC TGGGCCAGCT TAGCCTGCGT GAAGGCCCAC 660
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGACCA CCAGGTGAAG 720
CCAGGCACGG ATCTGGACCA CCAGGAGAAG TGCCTGTCCC AGCTCTACGA CCACATGCCA 780
GAGGGGCTGA CACCCTTGGC CACACTGAAG AATGATCAGC AGCGCCGACA GCTGGGGAAG 840
TCCCACTTGC GCAGAGCCAT GGCTCATCAC GAGGAGTCGG TGCGGGAGGC CAGCCTCTGC 900
AAGAAGCTGC GGAGCATTGA GGTGCTACAG AAGGTGCTGT GTGCCGCACA GGAACGCTCC 960
CGGCTCACGT ACGCCCAGCA CCAGGAGGAG GATGACCTGC TGAACCTGAT CGACGCCCCC 1020
AGTGTTGTTG CTAAAACTGA GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080
ACAGTCACGG ACTTCTTCCA GAAGCTGGGC CCCTTATCTG TGTTTTCGGC TAACAAGCGG 1140
TGGACGCCTC CTCGAAGCAT CCGCTTCACT GCAGAAGAAG GGGACTTGGG GTTCACCTTG 1200
AGAGGGAACG CCCCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTCGGTGGCA 1260
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTG TGGATTGTAA GTGGCTGACG 1320
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACGAGATCGA GATGAAAGTC 1380
GTGAGCCTCC TGGACTCCAC ATCATCCATG CATAATAAGA GTGCCACATA CTCCGTGGGA 1440
ATGCAGAAAA CGTACTCCAT GATCTGCTTA GCCATTGATG ATGACGACAA AACTGATAAA 1500
ACCAAGAAAA TCTCCAAGAA GCTTTCCTTC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620
AAGCTGCCCT CCCCTTTCAG CCTTCTCAAC TCAGACAGTT CTTGGTACTA A



SEQ ID NO:278 PBY6 Protein sequence:
Protein Accession #: NP_149094

DFILEHYSED GYLYEDEIAD LMDLRQACRT PSRDEAGVEL LMTYFIQLGF VESRFFPPTR 60
QMGLLFTWYD SLTGVPVSQQ NLLLEKASVL FNTGALYTQI GTRCDRQTQA GLESAIDAFQ 120
RAAGVLNYLK DTFTHTPSYD MSPAMLSVLV KMMLAQAQES VFEKISLPGI RNEFFMLVKV 180
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILLIDHQVK 240
PGTDLDHQEK CLSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRRAMAHH EESVREASLC 300
KKLRSIEVLQ KVLCAAQERS RLTYAQHQEE DDLLNLIDAP SVVAKTEQEV DIILPQFSKL 360
TVTDFFQKLG PLSVFSANKR WTPPRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCSASVA 420
GAREGDYIVS IQLVDCKWLT LSEVMKLLKS FGEDEIEMKV VSLLDSTSSM HNKSATYSVG 480
MQKTYSMICL AIDDDDKTDK TKKISKKLSF LSWGTNKNRQ KSASTLCLPS VGAARPQVKK 540
KLPSPFSLLN SDSSWY



SEQ ID NO:279 PBY8 DNA SEQUENCE

Nucleic Acid Accession #: AF107493

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCGCAGA ACCGCTACTG CTGCTTCGGT 60
CTCTCCTTGG GAAAAAATAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120
GACAATGGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180
CATAGACAGG GATGACCGTG ATGAGCGTGA ATCCCGAAGC AGGCGGAGGG ACTCAGATTA 240
CAAAAGATCT AGTGATGATC GGAGGGGTGA TAGATATGAT GACTACCGAG ACTATGACAG 300
TCCAGAGAGA GAGCGTGAAA GAAGGAACAG TGACCGATCC GAAGATGGCT ACCATTCAGA 360
TGGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420
CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480
GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540
CTTGCTTAGT TCCTGATATT ATTGTTCTCT TCCCCATTCC CACCTCAGTC CCTAAAGAAC 600
ATCCTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTAA 660
GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CTTTCmTA 720
GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT CCTTTTTTTT TTTGACTATA 780
GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840
TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900
TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960
TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGCTGTAAG 1020
CCGTGGTTTC GCCTTCGTGG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080
CAATCAGGTT GCTTCACTCA CCAAGTCTAG ATATTCATGA AAATGGAACA AGTCTGTACA 1140
ATTTTAAAAA AAGGTTGAAG GAGTGGTTTG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200
AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260
ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320
AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380
TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440
AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAG 1500
ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAGGGAAT ATTTAAGGAC 1560
TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGGAGTGGT GTCCTGCAGC 1620
TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATAGTCTGT 1680
TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740
CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800
TTATTGAATG CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTGCTT 1860

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAGT TACATTCTTT 1920 

GATTGGTAAT ATTGCCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980 

GATTTGTTAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2040 

TTAAAAGTAG AGCTTCATTC ATTTCATACC ATAGATACCA TCCTAGTAAA TCCAGAACAT 2100 

ATACAAGGTT CATGTGAGTC TGCTTTCTTG ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160 

TGTCAGAATG ACTAACCTAG GAGTTTGAAA CTCCTAAGAA ACTAAAACCT GTAAGACATT 2220 

TAAAAGTCTC CACAATTTTA ATGTATACAA AGCTATGTTA CTGTGTAACA CATTACAGTT 2280 

CAAATTCACT CCAGAAATAA AAGGCCAGTA GGATTAGGGA CTCACTGGTA GTTTGGAGTC 2340 

TCCCAGCACA CATCCCTCCT AGTGGGATGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400 

TTTGCTTCTG TATATCACAG TGAGTGGATG GCCCTTCAGC TTTTTCTCTC CTGGCCAGAC 2460 

ATGCAGTCTT GCCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520 

GGATTTGCAA GAACCAAATT GCTCAACAGT ATGTATGTTT AGAGGGGTTA GACTCCTTTT 2580 

TAAAATCTGG ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA CCACCCCCAC 2640 
CCTTGATCCT CCCACCCCCA AAAAAAAAAA AAAA 



SEQ ID NO:280 PBY8 Protein sequence:
Protein Accession #: XP_003261

MGSDKRVSRT ERSGRYGSII DRDDRDERES RSRRRDSDYK RSSDDRRGDR YDDYRDYDSP 60
ERERERRNSD RSEDGYHSDG DYGEHDYRHD ISDERESKTI MLRGLPITIT ESDIREMMES 120
FEGPQPADVR LMKRKTGESL LSS



SEQ ID NO:281 PCI2 DNA SEQUENCE 

Nucleic Acid Accession #: AF208291

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTGAGGC TGAGGGGGCC GAGCTCGCGC 60

GCCGCGTTCC CTTCTCCGTT GCCATGAACC GCGGACACCC CGGCCCCGAT GGCCCCCGTG 120

TACGAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA ATCAAGTGCC 180 

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240 

GGCTCCCACA GCAAAGTGTA CAGCCAGAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300 

ACAACCGTCA GCACCTCCTT GCCGGTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360 

GTCTTCCCAG GAAGCACCGG GCACATCGTG GTCACCTCAG CAAGCAGCAC TTCTGTCACC 420 

GGGCAAGTCC TCGGCGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480 

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540 

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATG CAAGCGGGGC CACTGTCGCC 600 

ACTGCCACCA CGTCTACTGC CACCTCCAAA AACAGCGGCT CCAACAGCGA GGGCGACTAT 660 

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA CCTACGAGGT CTTAGAGTTC 720 

TTGGGCCGAG GGACGTTTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780 

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAGTG 840 

AGCATCCTGG CCCGGTTGAG CACGGAGAGT GCCGATGACT ATAACTTCGT CCGGGCCTAC 900 

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960 

TATGACTTTC TGAAGCAAAA CAAGTTTAGC CCCTTGCCCC TCAAATACAT TCGCCCAGTT 1020 

CTCCAGCAGG TAGCCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTGAC 1080 

CTCAAACCAG AAAACATCAT GCTGGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140 

ATCGACTTTG GTTCAGCCAG CCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200 

AGATATTACA GGGCCCCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATG 1260 

TGGTCCCTGG GCTGTGTTAT TGCAGAATTG TTCCTGGGTT GGCCGTTATA TCCAGGAGCT 1320 

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380 

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAACCGTG ACACGGACTC ACCATATCCT 1440 

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500 

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560 

TTGGAAGGGA GCGACATGTT GGTAGAAAAG GCTGACCGGC GGGAGTTCAT TGACCTGTTG 1620 

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680 

CCCTTTGTCA CCATGACACA CTTACTCGAT TTTCCCCACA GCACACACGT CAAATCATGT 1740 

TTCCAGAACA TGGAGATCTG CAAGCGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800 

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860 

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920 

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980 

GCATCCATGG CTGCAGTGGC CCAGCGGAGC ATGCCCCTGC AGACAGGAAC AGCCCAGATT 2040 

TGTGCCCGGC CTGACCCGTT CCAGCAAGCT CTCATCGTGT GTCCCCCCGG CTTCCAAGGC 2100 

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGGAAAA TGCAGTTCCC 2160 

ATCGTCACTC AAGCCCCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220 

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTGCTTCCCC CAGCATGGCA GCAACTGACT 2280 

GGAGTGGCCA CCCACACATC AGTGCAGCAT GCCACCGTGA TTCCCGAGAC CATGGCAGGC 2340 

ACCCAGCAGC TGGCGGACTG GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400 

ATGCAGCAGC CTGCACTATT GACCGGTCAT GTGACCCTTC CAGCAGCACA GCCCTTAAAT 2460 

GTGGGTGTGG CCCACGTGAT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520 

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTG AGGTGTCCTC CTCTCAGGCC 2580 

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640 

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GTCACCTGTG GGTGGGGCGA CGTGGCCTCC 2700 

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGCCCCACG 2760 

GTCAGCGTCA TCACCATCAG CAGTGACACG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820 

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT CCACGACTCC 2880 

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTCCG TGCAGCAGCG TGCTGGGCAC 2940 
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AACAATGCCA ATGCCTTTGA CACCAAGGGG 
CGAACCATCA TCGTGCCACC CCTGAAAACC 
AGCCTGGTGC CAGTCAACAC CAGTCACCAC 
AACGTGACCT CCACCAGCGG TCACTCTTCA 
CAGCAGCGGC CGGGCCCCCA CTTCCAGCAG 
CAGCACATCA CCACGGACCG CACTGGGAGC 
ACCATGGCCC AGGCTCCGTA CTCCTTCCCG 
CCGCATCTGG CTGCAGCCGC TGCCGCTGCC 
TACACTGCGC CGGCGGCCCT GGGCTCCACC 
GGCTCTGCGC GCCACACCGT GCAGCACACT 
CCCGTGAGCA TGGGCCCCCG GGTCCTGCCC 
GCCCAATTTG CCCACCAGAC CTACATCAGC 
TACCCACTGA GCCCCGCCAA GGTCAACCAG 
GAGGGAGGGA GGGAGGGAGA GAATGGCCCG 
CCTGGGACCG TGCGCGCTGG CCTTTTATAC 
GGGCAGGGGC GGGGGGGGGG GGGGCAGAGG 
CTTGAACCGG GAAGTGGGAG GACGTAGAGC 
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA 



AGCCTGGAGA ATCACTGCAC GGGGAACCCC 3000 

CAGGCCAGCG AAGTATTGGT GGAGTGTGAT 3060 

TCGTCCTCCT ACAAGTCCAA GTCCTCCAGC 3120 

GGGAGCTCAT CTGGAGCCAT CACCTACCGG 3180 

CAGCAGCCAC TCAATCTCAG CCAGGCTCAG 3240 

CACCGAAGGC AGCAGGCCTA CATCACTCCC 3300 

CACAACAGCC CCAGCCACGG CACTGTGCAC 3360 

CACCTCCCCA CCCAGCCCCA CCTCTACACC 3420 

GGCACCGTGG CCCACCTGGT GGCCTCGCAA 3480 

GCCTACCCAG CCAGCATCGT CCACCAGGTC 3540 

TCGCCCACCA TCCACCCGAG TCAGTATCCA 3600 

GCCTCGCCAG CCTCCACCGT CTACACTGGA 3660 

TACCCTTACA TATAAACACT GGAGGGGAGG 3720 

AGGGAGGAGG GAGAGAAGGA GGGAGGCGCT 3760 

TGAAGATGCC GCACACAAAC AATGCAAACG 3840 

GCAGGGGGAC GGGTCGGGAC ACCAGTGAAA 3900 

AGAGAAGAGA ACATTTTTAA AAGGAAGGG A 3960 
TTTTAAAAAA 



SEQ ID NO:282 PCI2 Protein sequence:
Protein Accession #: NP_073577

MAPVYEGMAS HVQVFSPHTL QSSAFCSVKK LKVEPSSNWD MTGYGSHSKV YSQSKNIPPS 60
QPASTTVSTS LPVPNPSLPY EQTIVFPGST GHIVVTSASS TSVTGQVLGG PHNLMRRSTV 120
SLLDTYQKCG LKRKSEEIEN TSSVQIIEEH PPMIQNNASG ATVATATTST ATSKNSGSNS 180
EGDYQLVQHE VLCSMTNTYE VLEFLGRGTF GQVVKCWKRG TNEIVAIKIL KNRPSYARQG 240
QIEVSILARL STESADDYNF VRAYECFQHK NHTCLVFEML EQNLYDFLKQ NKFSPLPLKY 300
IRPVLQQVAT ALMKLKSLGL IHADLKPENI MLVDPSRQPY RVKVIDFGSA SHVSKAVCST 360
YLQSRYYRAP EIILGLPFCE AIDMWSLGCV IAELFLGWPL YPGASEYDQI RYISQTQGLP 420
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKEARKYIF NCLDDMAQVN 480
MTTDLEGSDM LVEKADRREF IDLLKKMLTI DADKRITPIE TLNHPFVTMT HLLDFPHSTH 540
VKSCFQNMEI CKRRVNMYDT VNQSKTPFIT HVAPSTSTNL TMTFNNQLTT VHNQAPSSTS 600
ATISLANPEV SILNYPSTLY QPSAASMAAV AQRSMPLQTG TAQICARPDP FQQALIVCPP 660
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG LLAQQAWPSG TQQILLPPAW 720
QQLTGVATHT SVQHATVIPE TMAGTQQLAD WRNTHAHGSH YNPIMQQPAL LTGHVTLPAA 780
QPLNVGVAHV MRQQPTSTTS SRKSKQHQSS VRNVSTCEVS SSQAISSPQR SKRVKENTPP 840
RCAMVHSSPA CSTSVTCGWG DVASSTTRER QRQTIVIPDT PSPTVSVITI SSDTDEEEEQ 900
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960
TGNPRTIIVP PLKTQASEVL VECDSLVPVN TSHHSSSYKS KSSSNVTSTS GHSSGSSSGA 1020
ITYRQQRPGP HFQQQQPLNL SQAQQHITTD RTGSHRRQQA YITPTMAQAP YSFPHNSPSH 1080
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGSARHT VQHTAYPASI 1140
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYISASPAST VYTGYPLSPA KVNQYPYI

SEQ ID NO:283 PBY1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017700

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

AGTCACAGCC AGGTAACCCT GGAGTGAAGC GGTTTAGTTA GAAGGGAGCA GATAAACTCG 60

TCACTCTAGT AGCTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120

GCCTCCAAAG CTTGTCTTTG CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180

TGGTCACCAA GAAAAAGAAT CTGGCCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240 

GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 

GGAGCCAATC TGACAGGACC GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 

GTGAGTGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 

TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGATACAGAC AGAAAAGGAT TATCTCAATG 480 

ATCTAGAGCT GTGTGTTAGG GAAGTGGTTC AGCCCCTGAG AAATAAAAAG ACTGATAGGC 540 

TGGATGTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 

TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAACCGGC CATGCAAGTA ATTGGAGAAG 660 

TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 

ATGAAGCACA TAGTATACTG GAGTCCTATG AAAAGGAAGA AGAGCTGAAG GAACATTTGA 780 

GCCACTGTAT CCAGTCCTTA AAGTAAGGCC TTTTCAAATG ATGATTCCCA TCTCCTCTCA 840 

GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 

TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCATTCCCTT TATAAAAAGC TGACATGCCA 960 

GAAGCCCTGA TTGACTTTTT TTCCCCCTGC GAGAATGACT AAAAATAACA TGGAAGAAGA 1020 

TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 

CTCTTCTTAA AGCTATTGTA ACTTGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 

ACACGTGGCC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 

ACTCTGCATG GCCGTCTTCT TTCCCCAAAC TCACTGTGGG GAGATGGGTG AAGACAAGTC 1260 

AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320 

CATTGGTGGG GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAGTAT CTATTCACTG 1380 

TGAAAATCTG AAAAATATAA CAAAGTATGT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 

SEQ ID NO:284 PBY1 Protein sequence:
Protein Accession #: NP_060170

1 11 21 31 41 51 

I I I I I I 

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTVVES SVSGDHSGTL RRSQSDRTEY 60

NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120

VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK

SEQ ID NO:285 PBQ9 DNA SEQUENCE

Nucleic Acid Accession #: X66534
Coding sequence: 523-2676 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 

ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 

GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360

TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 

AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660

TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 

AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960 

ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260 

AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 

ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560

GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160

GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 

CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460

ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 

CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 

AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:286 PBQ9 Protein sequence:
Protein Accession #: Q02108

1 11 21 31 41 51

I I I I I I

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 

QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120

QAVAAGVPVE VIRESLGEEV FKICYEEDEN ILGVVGGTLK DFLNSFSTLL KQSSHCQEAG 180

KRGRLEDASI LCLDKEDDFL HVYYFPPKRT TSLILPGIIK AAAHVLYETE VEVSLKPPCF 240

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LFCKTFPFHF MFDKDMTILQ 300 

FGKGIRRLMN RRDFQGKPNF EEYFEILTPK INQTFSGIMT MLNMQFVVRV RRWDNSVKKS 360

SRVMDLKGQM IYIVESSAIL PLGSPCVDRL EDFTGRGLYL SDIPIHNALR DVVLIGEQAR 420

AQDGLKKRLG KLKATLEQAH QALEEEKKKT VDLLCSIFPC EVAQQLWQGQ VVQAKKFSNV 480

TMLFSDIVGF TAICSCCSPL QVITMLNALY TRFDQQCGEL DVYKVETIGD AYCVAGGLHK 540
ESDTHAVQIA LMALKMMELS DEVMSPHGEP IKMRIGLHSG SVFAGVVGVK MPRYCLFGNN 600

VTLANKFESC SVPRKINVSP TTYRLLKDCP GFVFTPRSRE ELPPNFPSEI PGICHFLDAY 660
QQGTNSKPCF QKKDVEDGNA NFLGKASGID

SEQ ID NO:287 PFD2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000720

Coding sequence: 119-6664 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

AGAATAAGGG CAGGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG CCCAGCACAG TGCCCTGCAC ACAGTAGTCG CTCAATAAAT GTTCGTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGG CAGCAGCAAG CGGACCACGC 180

GAACGAGGCA AACTATGCAA GAGGCACCAG ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240 

TCAGCCGAAT AGCTCCAAGC AAACTGTCCT GTCTTGGCAA GCTGCAATCG ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAG GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAGAG CAAAAAACAG GGTAACTCGT CCAACAGCCG 420 

ACCTGCCCGC GCCCTTTTCT GTTTATCACT CAATAACCCC ATCCGAAGAG CCTGCATTAG 480

TATAGTGGAA TGGAAACCAT TTGACATATT TATATTATTG GCTATTTTTG CCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CATTCCCTGA AGATGATTCT AATTCAACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAGATTAT 660 

AGCGTATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGG AATGGATGGA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATTGTTTAG TGTAATTTTG GAACAATTAA CCAAAGAAAC 780

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTCCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAGTTGTCCT 900 

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGG TATTATTTGT 960 

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAAAACATG 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080

GAATGGACGC CAGTGTACTG CCAATGGCAC GGAATGTAGG AGTGGCTGGG TTGGCCCGAA 1140 

CGGAGGCATC ACCAACTTTG ATAACTTTGC CTTTGCCATG CTTACTGTGT TTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACGTGCTCTA CTGGGTAAAT GATGCGATAG GATGGGAATG 1260 

GCCATGGGTG TATTTTGTTA GTCTGATCAT CCTTGGCTCA TTTTTCGTCC TTAACCTGGT 1320 

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGGAGA 1380

TTTCCAGAAG CTCCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG AGGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGCCCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680

CAGCCGACGC TGGCGTCGCT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGGTTATCGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040 

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGGACT TCCCTGAGCA ACTTAGTGGC 2100 

ATCCTTATTA AACTCCATGA AGTCCATCGC TTCGCTGTTG CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220 

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TGTTCCAGAT 2280

CCTGACAGGC GAAGACTGGA ATGCTGTGAT GTACGATGGC ATCATGGCTT ACGGGGGCCC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTGAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGCGGAAGA AAAGGAGAGG AAAAAGATTG CCAGAAAAGA 2520

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAGTGACAA 2580

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

CGATGTGCCA GTAGGGGAAG AGGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700 

CGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG CCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880

TGCCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTGGGTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TGTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTCGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120 

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC CCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180

TAAGCACGTG GTCCAGTGCG TCTTCGTGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGCCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360 

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAACGTCC TCTCTGCTAT GATGGCGCTC TTCACAGTCT CCACGTTTGA 3480

GGGCTGGCCT GCGTTGCTGT ATAAAGCCAT CGACTCGAAT GGAGAGAACA TCGGCCCAAT 3540 

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGGCTTTGT CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660 

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAGTACAAGT TCTGGTACGT 3780

GGTGAACTCT TCGCCTTTCG AATACATGAT GTTTGTCCTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG CCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020 

TATAGACGTG GCCCTCAGCG AAGCGGACCC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080
TGCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCGTCTTTT 4140

CCGAGTGATG CGATTGGTGA AGCTTCTCAG CAGGGGGGAA GGCATCCGGA CATTGCTGTG 4200

GACTTTTATT AAGTCCTTTC AGCCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4260 

CTTCATCTAT GCGGTCATTG GCATGCAGAT GTTTGGGAAA GTTGCCATGA GAGATAACAA 4320 

CCAGATCAAT AGGAACAATA ACTTCCAGAC GTTTCCCCAG GCGGTGCTGC TGCTCTTCAG 4380

GTGTGCAACA GGTGAGGCCT GGCAGGAGAT CATGCTGGCC TGTCTCCCAG GGAAGCTCTG 4440 

TGACCCTGAG TCAGATTACA ACCCCGGGGA GGAGTATACA TGTGGGAGCA ACTTTGCCAT 4500 

TGTCTATTTC ATCAGTTTTT ACATGCTCTG TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560 

TGTCATCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTTGG GGCCTCACCA 4620 

TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAGG GAAGGATAAA 4680 

ACACCTTGAT GTGGTCACTC TGCTTCGACG CATCCAGCCT CCCCTGGGGT TTGGGAAGTT 4740 

ATGTCCACAC AGGGTAGCGT GCAAGAGATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800 

CGGGACAGTC ATGTTTAATG CAACCCTGTT TGCTTTGGTT CGAACGGCTC TTAAGATCAA 4860 

GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGG GCTGTGATAA AGAAAATTTG 4920 

GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTG ATGATGAGGT 4980 

AACCGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTAGGA AATTCAAGAA 5040 

ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTGCCCTACA 5100 

GGCGGGATTA AGGACACTGC ATGACATTGG GCCAGAAATC CGGCGTGCTA TATCGTGTGA 5160 

TTTGCAAGAT GACGAGCCTG AGGAAACAAA ACGAGAAGAA GAAGATGATG TGTTCAAAAG 5220 

AAATGGTGCC CTGCTTGGAA ACCATGTCAA TCATGTTAAT AGTGATAGGA GAGATTCCCT 5280 

TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGCCTTCAA TTCCACCTGC 5340 

AAGTGATACT GAGAAACCGC TGTTTCCTCC AGCAGGAAAT TCGGTGTGTC ATAACCATCA 5400 

TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460 

CAATATGTCC AAAGCTGCCC ATGGAAAGCG GCCCAGCATT GGGAACCTTG AGCATGTGTC 5520 

TGAAAATGGG CATCATTCTT CCCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580 

GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTCCGAC TCAGGAGATG AACAGCTCCC 5640 

AACTATTTGC CGGGAAGACC CAGAGATACA TGGCTATTTC AGGGACCCCC ACTGCTTGGG 5700 

GGAGCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCTCGC CCACCTGGAG 5760 

CAGGCAAAAC TATGGCTACT ACAGCAGATA CCCAGGCAGA AACATCGACT CTGAGAGGCC 5820 

CCGAGGCTAC CATCATCCCC AAGGATTCTT GGAGGACGAT GACTCGCCCG TTTGCTATGA 5880 

TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACCGGAGATC 5940 

CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000 

CATCTTCCCC CATCGCACGG CCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060 

TGCCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120 

GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA CCGTGCTACA CCCCCCTGAT 6180 

CCAAGTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTGCCGT CCCTGCACCG 6240 

CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300 

GACTGTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG CGGACAGCTT 6360 

GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420 

GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480 

TGCAGCCAGC ACCCTGCTTA ATGGGAACGT GCGTCCCCGA GCCAACGGGG ATGTGGGCCC 6540 

CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT CCTGGCTACA GCGACGAAGA 6600 

GCCAGACCCT GGGAGGGATG AGGAGGACCT GGCGGATGAA ATGATATGCA TCACCACCTT 6660 

GTAGCCCCCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGCGCAGG AGAGCCAGGG 6720 

GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780 

AGACTTTTGT ATAAGAGATG TCATGCCTCA AGAAAGCCAT AAACCTGGTA GGAACAGGTC 6840

CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900 

CCCCGCCCTC TCACAGAGGA TGGGTGAGGA GGCCAGACCT GCCCTGCCCC ATTGTCCAGA 6960 

TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020 

AGGCATGGCG GCGGGGTGCA GGGGAAAGTT AAAGGTGATG ACGATCATCA CACCTCGTGT 7080 

CGTTACCTCA GCCATCGGTC TAGCATATCA GTCACTGGGC CCAACATATC CATTTTTAAA 7140 
CCCTTTCCCC CAAATACACT GCGTCCTGGT TCCTGTTTAG CTGTTCTGAA ATA 

SEQ ID NO:288 PFD2 Protein sequence:
Protein Accession #: A38198

1 11 21 31 41 51 

I I I I I I 

MMMMMMMKKM QHQRQQQADH ANEANYARGT RLPLSGEGPT SQPKSSKQTV LSWQAAIDAA 60 

RQAKAAQTHS TSAPPPVGSL SQRKRQQYAK SKKQGNSSNS RPARALFCLS LNNPIRRACI 120

SIVEWKPFDI FILLAIFANC VALAIYIPFP EDDSNSTNHN LEKVEYAFLI IFTVETFLKI 180 

IAYGLLLHPN AYVRNGWNLL DFVIVIVGLP SVILEQLTKE TEGGNHSSGK SGGFDVKALR 240 

AFRVLRPLRL VSGVPSLQVV LNSIIKAMVP LLHIALLVLF VIIIYAIIGL ELFIGKMHKT 300

CFFADSDIVA EEDPAPCAFS GNGRCCTANG TECRSGWVGP NGGITNFDNF AFAMLTVFQC 360

ITMEGWTDVL YWVNDAIGWE WPWVYFVSLI ILGSFFVLML VLGVLSGEFS KEREKAKARG 420 

DFQKLREKQQ LEEDLKGYLD WITQAEDIDP ENEEEGGEEG KRNTSMPTSE TESVNTENVS 480 

GEGENRGCCG SLWCWWRRRG AAKAGPSGCR RWGQAISKSK LSRRWRRWNR FNRRRCRAAV 540 

KSVTFYWLVI VLVFLNTLTI SSEHYNQPDW LTQIQDIANK VLLALFTCEM LVKMYSLGLQ 600 

AYFVSLFNRF DCFVVCGGIT ETILVELEIM SPLGISVFRC VRLLRIPKVT RHWTSLSNLV 660

ASLLNSMKSI ASLLLLLFLF IIIFSLLGMQ LFGGKFNFDE TQTKRSTFDN FPQALLTVFQ 720

ILTGEDWNAV MYDGIMAYGG PSSSGMIVCI YFIILFICGN YILLNVFLAI AVDNLADAES 780 

LNTAQKEEAE EKERKKIARK ESLENKKNNK PEVKQIANSD NKVTIDDYRE EDEDKDPYPP 840

CDVPVGEEEE EEEEDEPEVP AGPRPRRISE LNMKEKIAPI PEGSAFFILS KTNPIRVGCH 900

KLINHHIFTN LILVFIMLSS AALAAEDPIR SHSFRNTILG YFDYAFTAIF TVEILLKMTT 960 

FGAFLHKGAF CRNYFNLLDH LVVGVSLVSF GIQSSAISVV KILRVLRVLR PLRAINRAKG 1020

LKHVVQCVFV AIRTIGNIMI VTTLLQFMFA CIGVQLFKGK FYRCTDEAKS NPEECRGLFI 1080

LYKDGDVDSP VVRERIWQNS DFNFDNVLSA MMALFTVSTF EGWPALLYKA IDSNGENIGP 1140

IYNHRVEISI FFIIYIIIVA FFHMNIFVGF VIVTFQEQGE KEYKNCELDK NQRQCVEYAL 1200

KARPLRRYIP KNPYQYKFWY VVNSSPFEYM MFVLIMLNTL CLAMQHYEQS KKFNDAMDIL 1260

NMVFTGVFTV EMVLKVTAFK PKGYFSDAWN TFDSLIVIGS IIDVALSEAD PTESENVPVP 1320
TATPGNSEES NRISITFFRL FRVMRLVKLL SRGEGIRTLL WTPIKSFQAL PYVALLIAML 1380

FFIYAVIGMQ KFGKVAMRDN NQINRNNNFQ TFPQAVLLLP RCATGEAWQE IKLACLPGKL 1440 

CDPESDYNPG EEYTCGSNFA IVYFISFYML CAFLIINLFV AVIMDNFDYL TRDWSILGPH 1500 

HLDEFKRIWS EYDPEAKGRI KHLDVVTLLR RIQPPLGFGK LCPHRVACKR LVAMNMPLNS 1560

DGTVMFNATL FALVRTALKI KTEGNLEQAN EELRAVIKKI WKKTSMKLLD QVVPPAGDDE 1620

VTVGKFYATF LIQDYFRKFK KRKEQGLVGK YPAKNTTIAL QAGLRTLHDI GPEIRRAISC 1680

DLQDDEPEET KREEEDDVFK RNGALLGNHV NHVNSDRRDS LQQTNTTHRP LHVQRPSIPP 1740

ASDTEKPLFP PAGNSVCHNH HNHNSIGKQV PTSTNANLNN ANMSKAAHGK RPSIGNLEHV 1800

SENGHHSSHK HDREPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPEI HGYFRDPHCL 1860 

GEQEYFSSEE CYEDDSSPTW SRQNYGYYSR YPGRNIDSER PRGYHHPQGF LEDDDSPVCY 1920 

DSRRSPRRRL LPPTPASHRR SSFNFECLRR QSSQEEVPSS PIFPHRTALP LHLMQQQIMA 1980 

VAGLDSSKAQ KYSPSHSTRS WATPPATPPY RDWTPCYTPL IQVEQSEALD QVNGSLPSLH 2040 

RSSWYTDEPD ISYRTFTPAS LTVPSSFRNK NSDKQRSADS LVEAVLISEG LGRYARDPKF 2100

VSATKHEIAD ACDLTIDEME SAASTLLNGN VRPRANGDVG PLSHRQDYEL QDFGPGYSDE 2160 
EPDPGRDEED LADEMICITT L 

SEQ ID NO:289 OBI6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002812

Coding sequence: 150-3362 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGGGAGCTGC GCGGGGATCC CCGGCCAGAC 180

CCCGCCGGTT GCCTCTCCTC AGCGTCCTGC TGCTGCCGCT CCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760 

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCCT 3360 

GAGGAGGGAG CCOGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420
CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTG GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCTTCCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900 

CCCCACCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960 

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGCGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTGTT T T TTTG r nT 4140 
TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 



SEQ ID NO:290 OBI6 Protein sequence:
Protein Accession #: NP_002812

1 11 21 31 41 51 

I I I I I I 

MGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAPGP 60 

VHVYWLLDGA PVQDTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDDVTGE EARSANASFN 120 

IKWIEAGPVV LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNFTLSIA DESFARVVLA PQDVVVARYE 240

EAMFHCQFSA QPPPSLQWLP EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300

CIGQGQRGPP IILEATLHLA EIEDMPLFEP RVFTAGSEER VTCLPPKGLP EPSVWWEHAG 360

VRLPTHGRVY QKGHELVLAN IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420

SQLEEGKPGY LDCLTQATPK PTVVWYRNQM LISEDSRFEV FKNGTLRINS VEVYDGTWYR 480

CMSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEFDKEA TVPCSATGRE KPTIKWERAD 540 

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC IASNGPQGQI RAHVQLTVAV FITFKVEPER 600 

TTVYQGHTAL LQCEAQGDPK PLIQWKGKDR ILDPTKLGPR MHIFQNGSLV IHDVAPEDSG 660 

RYTCIAGNSC NIKHTEAPLY VVDKPVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720

GLMFYCKKRC KAKRLQKQPE GEEPEMECLN GGPLQNGQPS AEIQEEVALT SLGSGPAATN 780

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV FLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840

LDFRRELEMF GKLNHANVVR LLGLCREAEP HYMVLEYVDL GDLKQFLRIS KSKDEKLKSQ 900

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QRGVKVSALG LSKDVYNSEY 960 

YHFRQAWVPL RWMSPEAILE GDFSTKSDVW AFGVLHWEVF THGEMPHGGQ ADDEVLADLQ 1020 
AGKARLPQPE GCPSKLYRLH QRCWALSPKD RPSFSEIASA LGDSTVDSKP



SEQ ID NO:291 AAB1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002205

Coding sequence: 1-3150 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60 

CGCCGACCCC CGCTSSTGCC GCTGCTGTTG CTGCTSSTGC CGCCGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTCGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1980
GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGCC 2040

CAGAATGTGG GTGAGGGTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAGT AAGCGACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2880 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGCCACCTC TGATGCCTGA 



SEQ ID NO:292 AAB1 Protein sequence:
Protein Accession #: NP_002196

1 11 21 31 41 51 

I I I I I I 

MGSRTPESPL RAVQLRWGPR RRPPLLPLLL LLLPPPPRVG GFNLDAEAPA VLSGPPGSFF 60 

GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120 

LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DPVGTCYLST 180 

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240

IAESYYPEYL LNLVQGQLCT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300 

GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420 

QQGVVFVFPG GPGGLGSRPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480

VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSPCLK ASGKHVADSI 540

GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIV LRNESEFRDK 600 

LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660 

GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 

FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDVV 780

SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 

SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900

SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 



SEQ ID NO:293 LBH4 DNA SEQUENCE

Nucleic Acid Accession #: BC001291
Coding sequence: 44-541 (start and stop codons are underlined)

1 11 21 31 41 51
I I I I I I

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACGATGGCGC TGCTCGCCTT 60
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACGCC AACCTGACTG CGAGACAACG 120
AGATCCAGAG GACTCCCAGC GAACGGACGA GGGTGACAAT AGAGTGTGGT GTCATGTTTG 180
TGAGAGAGAA AACACTTTCG AGTGCCAGAA CCCAAGGAGG TGCAAATGGA CAGAGCCATA 240
CTGCGTTATA GCGGCCGTGA AAATATTTCC ACGTTTTTTC ATGGTTGCGA AGCAGTGCTC 300
CGCTGGTTGT GCAGCGATGG AGAGACCCAA GCCAGAGGAG AAGCGGTTTC TCCTGGAAGA 360
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC 420
ACCTATCAAC TCATCAGTGT TCAAAGAATA TGCTGGGAGC ATGGGTGAGA GCTGTGGTGG 480
GCTGTGGCTG GCCATCCTCC TCCTGCTGGC CTCCATTGCA GCCGGCCTCA GCCTGTCTTG 540

AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGG ACTCGCTCCA GACCGTTGTC 600
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA TTACCTCTTG GTTTGACTTC CCAGGGTCTT 660
GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTCTTTG 780
AAATCAAACC TTGTAACTCA TTTATTGCTG ATGGCCACTC TTTTCCTTGA CTCCCCTCTG 840
CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG 900
TGCTGAGATG CTTCCGACCT TTCAGGTGAC GCAGGAACAC TGGGGGAGTC TGAATGATTG 960
GGGTGAAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT 1020
AGGGCTGCCC CCATTCCAGT GGTGGAGGCG CTGTGGATGG CTGCTTTTCC TCAACCTTTC 1080
CTACCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC 1140
ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC 1260
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTT CACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 

SEQ ID NO:294 LBH4 Protein sequence:
Protein Accession #: AAK01291

1          11         21         31         41         51
|          |          |          |          |          |
MALLALLLVV ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIRY 120
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASLAA GLSLS
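
As an illustrative check only (not part of the original disclosure), the relationship between the coding sequence coordinates given for SEQ ID NO:293 (44-541) and the protein of SEQ ID NO:294 can be sketched in Python. The sketch assumes the standard genetic code and 1-based, inclusive coordinates. The names `translate`, `EXCERPT`, `PEPTIDE` and `CDS_LENGTH` are hypothetical and chosen for this example; `EXCERPT` holds only the first 120 nucleotides of the listing, so the output covers only the first residues of the protein.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table applies to this human transcript).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, start, end=None):
    """Translate dna from 1-based position `start` (inclusive) to `end`.

    Whitespace in the listing is ignored. Translation stops at the first
    stop codon or when fewer than three nucleotides remain.
    """
    seq = "".join(dna.split()).upper()
    cds = seq[start - 1:end]
    residues = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 120 nucleotides of SEQ ID NO:293 as listed (block spacing kept).
EXCERPT = """
GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACGATGGCGC TGCTCGCCTT
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACGCC AACCTGACTG CGAGACAACG
"""

# The coding sequence is stated to begin at position 44.
PEPTIDE = translate(EXCERPT, 44)

# Length implied by the stated coordinates 44-541.
CDS_LENGTH = 541 - 44 + 1

print(PEPTIDE)
print(CDS_LENGTH, CDS_LENGTH // 3)
```

The coordinates 44-541 span 498 nucleotides, that is, 166 codons, which is consistent with a 165-residue protein followed by a stop codon. The residues produced from the excerpt should correspond to the start of SEQ ID NO:294.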



It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference. 

WHAT IS CLAIMED IS:

1. A method of detecting a prostate cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16.

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat prostate cancer.

12. The method of claim 1, wherein the patient is suspected of having
prostate cancer.

13. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16,
thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the
prostate cancer-associated transcript in a biological sample from the patient prior to,
or earlier in, the therapeutic treatment.

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated antibody in the
biological sample by contacting the biological sample with a polypeptide encoded by a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16, wherein the polypeptide specifically binds to the
prostate cancer-associated antibody, thereby monitoring the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated antibody to a level of the
prostate cancer-associated antibody in a biological sample from the patient prior to,
or earlier in, the therapeutic treatment.

18. The method of claim 16, wherein the patient is a human.

19. A method of monitoring the efficacy of a therapeutic treatment of
prostate cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a prostate cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody, wherein the antibody
specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to
a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the prostate cancer-associated polypeptide to a level of the
prostate cancer-associated polypeptide in a biological sample from the patient prior to,
or earlier in, the therapeutic treatment.

21. The method of claim 19, wherein the patient is a human. 

22. An isolated nucleic acid molecule consisting of a polynucleotide 
sequence as shown in Tables 1-16. 

23. The nucleic acid molecule of claim 22, which is labeled. 

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22. 

26. A host cell comprising the expression vector of claim 25. 

27. An isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1-16.

28. An antibody that specifically binds a polypeptide of claim 27. 

29. The antibody of claim 28, further conjugated to an effector component. 

30. The antibody of claim 29, wherein the effector component is a
fluorescent label.

31. The antibody of claim 29, wherein the effector component is a
radioisotope or a cytotoxic chemical.

32. The antibody of claim 29, which is an antibody fragment.

33. The antibody of claim 29, which is a humanized antibody.

34. A method of detecting a prostate cancer cell in a biological sample
from a patient, the method comprising contacting the biological sample with an antibody of
claim 28.

35. The method of claim 34, wherein the antibody is further conjugated to
an effector component.

36. The method of claim 35, wherein the effector component is a
fluorescent label.

37. A method of detecting antibodies specific to prostate cancer in a
patient, the method comprising contacting a biological sample from the patient with a
polypeptide encoded by a nucleic acid that comprises a sequence from Tables 1-16.

38. A method for identifying a compound that modulates a
prostate cancer-associated polypeptide, the method comprising the steps of:
(i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and
(ii) determining the functional effect of the compound upon the polypeptide.

39. The method of claim 38, wherein the functional effect is a physical
effect.

40. The method of claim 38, wherein the functional effect is a chemical
effect.

41. The method of claim 38, wherein the polypeptide is expressed in a
eukaryotic host cell or cell membrane.

42. The method of claim 38, wherein the functional effect is determined by
measuring ligand binding to the polypeptide.

43. The method of claim 38, wherein the polypeptide is recombinant.

44. A method of inhibiting proliferation of a prostate cancer-associated
cell to treat prostate cancer in a patient, the method comprising the step of administering to
the patient a therapeutically effective amount of a compound identified using the method of
claim 38.

45. The method of claim 44, wherein the compound is an antibody.

46. The method of claim 45, wherein the patient is a human.

47. A drug screening assay comprising the steps of:
(i) administering a test compound to a mammal having prostate cancer or a
cell isolated therefrom; and
(ii) comparing the level of gene expression of a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16 in a
treated cell or mammal with the level of gene expression of the polynucleotide in a control
cell or mammal, wherein a test compound that modulates the level of expression of the
polynucleotide is a candidate for the treatment of prostate cancer.

48. The assay of claim 47, wherein the control is a mammal with prostate
cancer or a cell therefrom that has not been treated with the test compound.

49. The assay of claim 47, wherein the control is a normal cell or mammal.

50. A method for treating a mammal having prostate cancer comprising
administering a compound identified by the assay of claim 47.

51. A pharmaceutical composition for treating a mammal having prostate
cancer, the composition comprising a compound identified by the assay of claim 47 and a
physiologically acceptable excipient.

52. The method according to claim 1, wherein said biological sample is
contacted with a plurality of polynucleotides comprising a first polynucleotide that
selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in
Tables 1-16; and a second polynucleotide that selectively hybridizes to a second sequence at
least 80% identical to a second sequence as shown in Tables 1-16.

53. A method according to claim 52, wherein the plurality of
polynucleotides comprises a third polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a third sequence as shown in Tables 1-16.

54. A method of detecting a prostate cancer-associated transcript, the
method comprising contacting a biological sample from the patient with a plurality of
polynucleotides wherein at least two of said polynucleotides selectively hybridize to a
different sequence at least 80% identical to a sequence as shown in Tables 1-16.

55. A method of detecting a prostate cancer, the method comprising the
steps of:
(i) providing a biological sample from a patient;
(ii) contacting the biological sample with a first polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16 to
determine the level of a prostate cancer-associated transcript in the biological sample; and
with a second polynucleotide that selectively hybridizes to a second sequence at least 80%
identical to a sequence not shown in Tables 1-16; wherein the expression of said second
sequence is not substantially changed in prostate cancer, to determine the level of expression
of a control transcript in the biological sample; and
(iii) comparing the level of the prostate cancer-associated transcript to a level
of the normal tissue associated transcript in the biological sample.

56. A method of quantitating a prostate cancer-associated transcript in a
cell from a patient, the method comprising contacting a biological sample from the patient
with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a
sequence as shown in Tables 1-16.

57. The method of claim 56, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16.

58. The method of claim 56, wherein the biological sample is a tissue
sample.

59. The method of claim 56, wherein the biological sample comprises
isolated nucleic acids.

60. The method of claim 56, wherein the nucleic acids are mRNA.

61. The method of claim 59, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

62. The method of claim 56, wherein the polynucleotide comprises a
sequence as shown in Tables 1-16.

63. The method of claim 56, wherein the polynucleotide is labeled.

64. The method of claim 63, wherein the label is a fluorescent label.

65. The method of claim 56, wherein the polynucleotide is immobilized on
a solid surface.

66. The method of claim 56, wherein the patient is undergoing a
therapeutic regimen to treat metastatic prostate cancer.

67. The method of claim 56, wherein the patient is suspected of having
metastatic prostate cancer.

68. A biochip comprising a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16.

69. A method of screening drug candidates comprising:
i) providing a cell that expresses an expression profile gene selected from the
group consisting of an expression profile gene set forth in Tables 1-16 or fragment thereof;
ii) adding a drug candidate to said cell; and
iii) determining the effect of said drug candidate on the expression of said
expression profile gene.

70. A method according to claim 69, wherein said determining comprises
comparing the level of expression in the absence of said drug candidate to the level of
expression in the presence of said drug candidate.
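
By way of illustration only, the "at least 80% identical" and "at least 95% identical" thresholds recited in the claims can be pictured with a short Python sketch. It assumes two sequences that have already been aligned to equal length and counts matching positions; it does not perform an alignment or handle gaps, and it is not offered as the definition of identity used in this specification, which governs. The names `percent_identity`, `meets_threshold`, `REFERENCE`, `VARIANT_A` and `VARIANT_B` are hypothetical. `REFERENCE` is the 20-nucleotide stretch at positions 44-63 of SEQ ID NO:293; the two variants are invented for the example.

```python
def percent_identity(a, b):
    """Return the percentage of positions at which two pre-aligned,
    equal-length sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a, b, threshold=80.0):
    """True when the identity of a and b is at least `threshold` percent."""
    return percent_identity(a, b) >= threshold


# Positions 44-63 of SEQ ID NO:293 (start of the coding sequence).
REFERENCE = "ATGGCGCTGCTCGCCTTGCT"
# Invented variants: 2 and 5 substituted positions out of 20, respectively.
VARIANT_A = "TTGGCGCTGCTCGCCTTGCA"
VARIANT_B = "TTGGCGCAGCTCGGCTTGAA"

print(percent_identity(REFERENCE, VARIANT_A), meets_threshold(REFERENCE, VARIANT_A))
print(percent_identity(REFERENCE, VARIANT_B), meets_threshold(REFERENCE, VARIANT_B))
```

Under these assumptions the first variant (18 of 20 positions matching) satisfies the 80% threshold but not the 95% threshold, while the second (15 of 20) satisfies neither.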
